Abstract

We give an overview of the ideas central to some recent developments in 
the ergodic theory of the stochastically forced Navier Stokes equations and 
other dissipative stochastic partial differential equations. Since our desire is 
to make the core ideas clear, we will mostly work with a specific example: 
the stochastically forced Navier Stokes equations. To further clarify ideas, we 
will also examine in detail a toy problem. A few general theorems are given. 
Spatial regularity, ergodicity, exponential mixing, coupling for a SPDE, and 
hypoellipticity are all discussed. 



This article attempts to collect a number of ideas which have proven useful 
in the study of stochastically forced dissipative partial differential equations. The 
discussion will center around those of ergodicity but will also touch on the regularity 
of both solutions and transition densities. Since our desire is to make the core 
ideas clear, we will mostly work with a specific example: the stochastically forced 
Navier Stokes equations. To further clarify ideas, we will also examine in detail 
a toy problem. Though we have not tried to give any great generality, we also 
present a number of abstract results to help isolate what assumptions are used in 
which arguments. Though a few results are presented in new ways and a number 
of proofs are streamlined, the core ideas remain more or less the same as in the 
originally cited papers. We do slightly improve the exponential mixing results given
in [Mat02c]; however, the techniques used are the same. Lastly, we do not claim to
be exhaustive. This is not meant to be an all-encompassing review article. The
viewpoint given here is a personal one; nonetheless, citations are given to good starting
points for related works both by the author and others.

Consider the two-dimensional Navier-Stokes equation with stochastic forcing: 

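$$
\begin{aligned}
\frac{\partial u}{\partial t} + (u \cdot \nabla)u + \nabla P &= \nu \Delta u + \frac{\partial W(x,t)}{\partial t} \\
\nabla \cdot u &= 0
\end{aligned}
\tag{1}
$$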
We restrict to the $2\pi$-periodic case with mean flow zero, though many of the results
apply equally to bounded domains with Dirichlet boundary conditions. The addition 
of a stochastic forcing can be motivated by a number of considerations. Since the 
Navier-Stokes equations are dissipative, if there is no external forcing, the system 
relaxes to the zero state where the fluid is at rest. Hence, if one is interested in 
probing the nonlinear dynamics, some forcing is necessary. Stochastic forcing is 
often proposed, particularly in the study of turbulent fluid flows, as a way to add a 
"generic" forcing. Generic is then interpreted in the sense of the typical events in 
probability space. 

We will take the forcing to be the sum of independent Brownian motions exciting 
independent Fourier modes. This is convenient because one of our long term goals 
is to understand the interaction between the different scales and the differences of 
the dynamics at different scales. Specifically we set 



where $\mathcal{K} \subset \mathbb{Z}^2$ does not contain the zero wave number, ensuring that the spatial
mean stays zero. Here $\beta_k = \frac{1}{\sqrt{2}}(\beta_k^{(1)} + i\beta_k^{(2)})$, where the $\beta_k^{(i)}$ are mean zero, variance
one Brownian motions, independent except for the reality condition $\beta_k = \bar\beta_{-k}$. The
$\sigma_k \in \mathbb{C}$ are constants used to set the spatial roughness of the flow. They also
satisfy the reality condition $\sigma_k = \bar\sigma_{-k}$. We make the standing assumption that
$\mathcal{E}_0 = \sum |\sigma_k|^2 < \infty$ and define $\sigma_{\max}^2 = \max |\sigma_k|^2$. Similarly, if $\mathcal{E}_\alpha = \sum |\sigma_k|^2 |k|^{2\alpha} < \infty$
then for every $t$, $W(\cdot, t)$ is almost surely in the Sobolev space $H^\alpha(\mathbb{T}^2) \times H^\alpha(\mathbb{T}^2)$.
Here $\mathbb{T}^2$ is the two dimensional torus. If the $|\sigma_k|$ decay exponentially or faster, the
forcing field is analytic in space almost surely.
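
To get a feel for the summability condition on $\mathcal{E}_\alpha$, here is a minimal numerical sketch. It assumes the algebraic profile $|\sigma_k| = |k|^{-a}$ on all nonzero wave numbers; the exponent `a`, the function name `partial_energy`, and the square truncation $\max(|k_1|,|k_2|) \le n$ are illustrative choices and do not come from the text. For this profile the sum over $\mathbb{Z}^2$ converges exactly when $2a - 2\alpha > 2$, so the partial sums should saturate in that case and keep growing otherwise.

```python
import math

def partial_energy(a, alpha, n):
    """Truncated E_alpha = sum |sigma_k|^2 |k|^(2 alpha) over 0 < max(|k1|,|k2|) <= n.

    Assumes the illustrative profile |sigma_k| = |k|^(-a)."""
    total = 0.0
    for k1 in range(-n, n + 1):
        for k2 in range(-n, n + 1):
            if k1 == 0 and k2 == 0:
                continue  # the zero wave number is excluded from the forcing
            norm_k = math.hypot(k1, k2)
            total += norm_k ** (2.0 * alpha - 2.0 * a)
    return total

if __name__ == "__main__":
    for a in (1.0, 3.0):
        sums = [partial_energy(a, 0.0, n) for n in (10, 20, 40)]
        print(f"a = {a}: partial sums of E_0 =", sums)
```

With `a = 3.0` the increments between successive truncations become negligible, while with `a = 1.0` they stay of order one, reflecting the logarithmic divergence of $\sum |k|^{-2}$ in two dimensions.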

In the next section, we continue with the setup. In section 2, we briefly dis- 
cuss invariant measures. In section 3, we discuss how the structure of the solution 
changes for different choices of forcing. In particular, we discuss the spatial smooth- 
ness. In sections 4, 5, and 6, we highlight some of the difficulties with ergodic theory 
in infinite dimensions. In section 7, we discuss ergodicity of the stochastically forced 
Navier Stokes (SNS) equations under various assumptions, including the ideas of 
"effective ellipticity" and the reduction to Gibbsian dynamics (dynamics with mem- 
ory). In section 8, we formulate the results in a more general setting and examine a 
toy model to highlight the main ideas. In sections 9 and 10, we discuss the contrac- 
tive nature of the SNS equations and the fluctuations of its energy and enstrophy. 
In section 13, we discuss the Lyapunov structure and localization in the general 
setting. In section 15, we prove a general exponential mixing result using a non- 
Markovian coupling argument. In section 16, we discuss some other systems where 
the discussed methods apply. In section 17, we give a number of partial results in 
the setting where the previously stated ergodic theorems do not hold. Lastly in 
section 18, we list a few open questions. 



1. The Setting

It is convenient to project (1) onto the space of divergence free vector fields
thereby removing the pressure, which is just a Lagrange multiplier enforcing the
divergence free constraint. To this end, $\mathbb{L}^2$ will denote the closure in the $L^2$ topology
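of divergence free, mean zero, $C^\infty$ vector fields on the two dimensional torus $\mathbb{T}^2$.
Similarly the Sobolev space $\mathbb{H}^\alpha$ is defined as $\mathbb{L}^2$ except that the closure is taken in
$H^\alpha(\mathbb{T}^2) \times H^\alpha(\mathbb{T}^2)$. Projecting equation (1) onto $\mathbb{L}^2$ produces the stochastic evolution
equation

$$\frac{\partial u(x,t)}{\partial t} + \nu A^2 u(x,t) + B(u(x,t),u(x,t)) = \frac{\partial W(x,t)}{\partial t} \tag{3}$$

where $B(u,v) = P_{\mathrm{div}}(u \cdot \nabla)v$, $A^2 u = -P_{\mathrm{div}}\Delta u$ and $P_{\mathrm{div}}$ is the projection operator
onto the space of divergence free vector fields.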

To better elucidate some of the structure of (3), it is useful at times to consider
the equation for the vorticity $\omega(x,t) = \frac{\partial u_2}{\partial x_1} - \frac{\partial u_1}{\partial x_2}$ written in Fourier space. Notice
that in two dimensions $\omega$ is a scalar quantity. Setting $\omega(x,t) = \sum_k \omega_k(t) e^{ik \cdot x}$, one
obtains the infinite system of coupled diffusions

e+j=k I I 

Unlike many lattices of interacting diffusions, this system is not invariant under
translations in the lattice index $k \in \mathbb{Z}^2$. In fact for large $|k|$ the linear term in (4)
dominates the other drift term which couples the modes together. This observation 
is at the heart of all that follows. It gives rise to the dissipative nature of the 
dynamics. 

Since the noise is additive in our model problem, it is completely standard that
there exists a stochastic flow which depends continuously on both the initial data
and the noise realization $W$ considered as an element of the probability space $\Omega =
C((-\infty,\infty); \mathbb{R}^{2|\mathcal{K}|})$. To complete the picture, we work on $(\Omega, \mathbb{P}, \mathcal{F}, \mathcal{F}_t)$. Here $\Omega$,
as just defined, is the path space of the Brownian trajectories, $\mathbb{P}$ is the Wiener
measure on this space, $\mathcal{F}$ is the associated sigma algebra, and $\mathcal{F}_t$ and $\mathcal{F}_{[s,t]}$ are
the filtrations containing the information of the noise increments up to time $t$ and
between time $s$ and $t$ respectively. We will at times write $\varphi_{s,t}(W)u_0$ or $\varphi^W_{s,t}u_0$ for
$u(x,t,W)$ with $u(s) = u_0$ and $\varphi_t(W)$ for $\varphi_{0,t}(W)$. The notation $u_{[s,t]}$ means the
segment of trajectory on $[s,t]$. We will write $\mathbb{E}$ to denote expectation with respect
to the probability measure $\mathbb{P}$; that is, $\mathbb{E}F(W) = \int_\Omega F(W)\,\mathbb{P}(dW)$. At times we will
speak of solutions existing on the time interval $(-\infty,\infty)$. By this we mean a measure
$\mathbb{P}$ on $\Omega \times C(-\infty,\infty;X)$ so that the following holds for almost every $(W,u)$: $W$ is
distributed as a Wiener process, $u(t)$ is adapted to the filtration generated by $W(s)$
with $s \le t$, and the pair $(W,u)$ solves the integral form of (3) over any finite time
interval.



2. Ergodicity and Invariant Measures 

When investigating a stochastically forced system, such as the stochastically 
forced Navier Stokes equation (SNS), the main interest is often the behavior and 
structure of the system once it has forgotten its initial condition. In other words, 
we are interested in the behavior of the system in its statistical steady state. The 
statistical steady states of a system are described by its invariant measure. In our
setting, a measure $\mu$ on $\mathbb{L}^2$ is invariant under the dynamics if for any $t > 0$ and
Borel set $A \subset \mathbb{L}^2$

$$\mu\{u_0 : u_0 \in A\} = \int_\Omega \mu\{u_0 : u(t,W;u_0) \in A\}\,\mathbb{P}(dW) = \mathbb{E}\,\mu\{u_0 : \varphi_t(W)u_0 \in A\}.$$

A system is uniquely ergodic, or simply ergodic, if there is only one such invariant
measure. The Birkhoff ergodic theorem (cf. [Sin94]) guarantees that for any bounded
function $f : \mathbb{L}^2 \to \mathbb{R}$

$$\frac{1}{T}\int_0^T f(u(t,W;u_0))\,dt \to \bar f(u_0) = \int f(u)\,d\mu_{u_0}(u) \quad \text{as } T \to \infty$$

if $u_0$ is a typical point for some invariant measure $\mu_{u_0}$. We have labeled the invariant
measure with the initial point $u_0$ to emphasize that different points might converge
to different $\bar f$, each the average of $f$ against a different invariant measure. However,
if the system is ergodic then there is only one such invariant measure and the time
average $\bar f$ is independent of the initial condition. Hence, the statistics of almost
every trajectory will converge to a unique common distribution. This implies that the
statistics of the system's asymptotic behavior are insensitive to the initial condition.
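
The insensitivity to initial conditions can be seen in the simplest ergodic diffusion. The sketch below is only an illustration and is not taken from the paper: it uses the one dimensional Ornstein-Uhlenbeck equation $dX = -X\,dt + dB$, whose unique invariant measure is $N(0, 1/2)$, and the bounded observable $f(x) = \cos x$, for which $\int f\,d\mu = e^{-1/4} \approx 0.78$. The function name, step size, and time horizon are assumptions chosen for convenience.

```python
import math
import random

def time_average_of_cos(x0, T=2000.0, dt=0.01, seed=0):
    """Time average of f(x) = cos(x) along an Euler-Maruyama path of dX = -X dt + dB."""
    rng = random.Random(seed)
    x = x0
    acc = 0.0
    n_steps = int(round(T / dt))
    sqrt_dt = math.sqrt(dt)
    for _ in range(n_steps):
        acc += math.cos(x) * dt                        # accumulate the time integral of f
        x += -x * dt + sqrt_dt * rng.gauss(0.0, 1.0)   # one Euler-Maruyama step
    return acc / T

if __name__ == "__main__":
    print("start at -5:", time_average_of_cos(-5.0, seed=1))
    print("start at +5:", time_average_of_cos(5.0, seed=2))
    print("stationary mean of cos:", math.exp(-0.25))
```

Two runs started far apart and driven by independent noise should both give time averages close to the stationary mean, up to discretization and sampling error.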



3. The Form of the Forcing 

Consider the two classes of forcing distinguished by whether $|\mathcal{K}| < \infty$ or $|\mathcal{K}| = \infty$.
The first class is the most natural from both the point of view of turbulence theory
and that of exploring the nonlinear dynamics of the Navier-Stokes equations. In that
case, one wants to force the equations at some scale, usually at large or intermediate
scales, and then observe the transfer of energy and enstrophy up and down scale.
Generally, forcing which excites all the Fourier modes ($\mathcal{K} = \mathbb{Z}^2$) is the first case
studied for a given stochastic partial differential equation (SPDE). This was true
of the SNS (cf. [FM95, Fer97, DPZ96]). In these investigations, the forcing was
assumed to be spatially rough; essentially $|\sigma_k| \sim |k|^{-\alpha}$. This assumption means
that the forcing is not analytic in space. The requirement of rough forcing appears 
to not simply be a technical assumption, and the methods from [FM95, Fer97, 
DPZ96] do not seem to work in other elliptic cases. It is important to mention 
that the qualitative behavior of the system appears to be quite different depending 
on whether the magnitude of the modes decays at least exponentially or simply 
algebraically. 

Consider the following two theorems proven respectively in [MS03] and [Mat02c]. 
The first theorem compares the vorticity equation to the associated linear stochastic 
heat equation. This equation is just the Ornstein-Uhlenbeck process 




(5)



If $z(x,t) = \sum_{k \in \mathbb{Z}^2} z_k(t)\exp(ik \cdot x)$ then (5) becomes
The following theorem states that at small scales $z$ and $\omega$ are quite similar, even
pathwise, if the forcing decays algebraically in the spatial Fourier modes.

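Theorem 1. Assume that $c|k|^{-\alpha} \le |\sigma_k| \le C|k|^{-\alpha}$ for some positive constants $c$ and $C$. Let
$\omega'_k$ and $z'_k$ denote $\omega_k$ and $z_k$ rescaled by a common normalizing factor depending on $k$.
For any uniformly continuous, bounded function $F$ on $C([0,1];\mathbb{R}^d)$,
$\mathbb{E}|F(\omega'_{k_1},\ldots,\omega'_{k_d}) - F(z'_{k_1},\ldots,z'_{k_d})| \to 0$ as $k_1,\ldots,k_d \to \infty$. [MS03]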

Theorem 1 says that when the forcing decays algebraically in the magnitude of 
the wave number k, then so does the solution. In fact, at small scales, it is pathwise 
a perturbation of (5) in some sense. Hence, the nonlinearity is really secondary in 
setting the infinite dimensional character of the problem. 

The second theorem covers the case when the forcing decays at least exponen- 
tially fast and, in particular, covers the case when only a finite number of modes are 
forced. Earlier versions of this theorem were proven in [Mat98, MS99] and all of the 
versions build on deterministic versions which date back at least to [FT89] and are 
informed by later works such as [LO97, DT95, DG95, OT00]. In [BKL00, Shi02],
yet different formulations of Theorem 2 are given and proven. The second reference 
seems to give the best scaling with viscosity, while the version below gives explicit, 
eventually stationary processes which bound the quantities of interest. 

Theorem 2. If there exist positive constants $\beta$ and $C$ so $|\sigma_k| \le Ce^{-\beta|k|}$ then for any
initial $u(0) \in \mathbb{L}^2$ there exist two stochastic processes $\tau(t,W)$ and $D(t,W)$, positive
for $t > 0$, so that

$$|u_k(t,W)| \le D(t,W)e^{-\tau(t,W)|k|} \quad W\text{-almost surely for all } t > 0$$

and such that $\lim_{t\to\infty}\mathbb{E}\tau(t) \in [c_1,C_1]$ and $\lim_{t\to\infty}\mathbb{E}D(t) \in [c_2,C_2]$ where $c_i$ and $C_i$
are positive constants which depend on the structure of the forcing but not on the
initial data $u(0)$. (For the form of the equations for $\tau$ and $D$ and information about
their moments see [Mat02b].)

Though no lower bound on $|u_k|$, as $|k| \to \infty$, has been proven, there is strong
evidence that this is the correct order. Even when the forcing decays faster than
exponentially, there is no evidence that the solution does. It is interesting to note
that in all of the current estimates the decay rate fluctuates in time. Whether this
is correct is not clear. It is a little surprising that even when only a few modes are
forced, $\tau(t)$ does not converge to a constant as $t \to \infty$.

Comparing Theorems 1 and 2, one sees that there is a strong qualitative difference
between the two cases. In the first, the forcing sets the small scale structure. In the
second, the small scale structure seems to be dictated by the nonlinear dynamics.

4. The Difficulty of Infinite Dimensions 

It is reasonable to ask why the ergodic theory of stochastically forced PDEs is 
more complicated than that of finite dimensional SDEs. A basic problem is that 
there is no single distinguished topology associated with most infinite dimensional 
diffusions. Since not all topologies are equivalent, if one wants to write the transition
density one must use exactly the right base measure. This means one must know
exactly the natural topology of the problem. This is underlined by the following
simple example. Consider two SPDEs of the form (5) with $|\sigma_k| = |k|^{-\alpha}$ in one case
and $|\sigma_k| = |k|^{-\alpha+\epsilon}$ in the other. These two processes induce measures on the phase
space $\mathbb{L}^2$ which are mutually singular at any positive moment of time, even if they
start from the same point.

In general, getting the correct topology is a very delicate matter. There seems to 
be no good general tool to address this class of problems. In the setting of Theorem 
1, one strongly suspects that the measure induced by the SNS at a moment of time 
t is equivalent to that induced by (5). However, even in this case, equivalence has 
only been proven when the Laplacian is replaced by $A^{2+\epsilon}$. For this "hyperviscous"
problem, the equivalence is proven in [MS03]. 

5. Diffusions, Ellipticity, and Hypoellipticity 

Just as an ordinary SDE is associated with a PDE which evolves its density,
one can associate with an SPDE a "diffusion" on a larger space which evolves
the probability transition density. In some cases this can be made rigorous (cf.
[DPZ92, FG98, DPZ02]). Formally, consider the "diffusion" on $\mathbb{R}^{\mathbb{Z}^2 \times \mathbb{Z}^2}$ associated
with the stochastic process (4). Writing $z_k = x_k + iy_k$, the backward Kolmogorov
equation would be

$$
\begin{aligned}
\frac{\partial}{\partial t}U(\{x_k\},\{y_k\},t) &= \mathcal{L}U(\{x_k\},\{y_k\},t) \\
U(\{x_k\},\{y_k\},0) &= U_0(\{x_k\},\{y_k\})
\end{aligned}
\tag{7}
$$

where $U_0 : \mathbb{R}^{\mathbb{Z}^2 \times \mathbb{Z}^2} \to \mathbb{R}$ is the initial condition. By $\{x_k\}$ we mean the collection
$\{x_k : k \in \mathbb{Z}^2\}$. The differential operator $\mathcal{L}$ is

where 

F k = -v\k\ 2 z k + i 22 ~\u^~ Ze z i ■ 
t+j=k I I 

The case when $\mathcal{K} = \mathbb{Z}^2$ corresponds to the elliptic setting. If $|\mathcal{K}| < \infty$, then the
operator $\mathcal{L}$ is degenerate to leading order in all but a finite number of coordinates.
Even in the case $\mathcal{K} \neq \mathbb{Z}^2$ but $|\mathcal{K}| = \infty$, it is still degenerate. In either of the last cases,
the ergodic theorems stated previously are surprising in the sense that they imply 
some sort of ellipticity without requiring the detailed geometric information needed 
to verify hypoellipticity. These ideas will be elaborated upon in section 17. 

6. Ergodicity with Elliptic, Rough Forcing 

In [FM95, Fer97] ergodicity is proven under the assumption, translated to our
setting, that $c|k|^{-\alpha} \le |\sigma_k| \le C|k|^{-\alpha}$ for some positive constants. The proof of ergodicity
relies on the Bismut-Elworthy-Li formula and seems to fundamentally require
an elliptic diffusion with algebraically decaying spectrum. In light of Theorem 1,
it is tempting to characterize the system in this regime as a perturbation of the
linear process since the linear process sets the small scale structure. Eckmann and
Hairer [EH01] showed that finite dimensional Malliavin calculus could be combined
with the type of analysis used in [Cer99, FM95, Fer97] to show that a stochastically
forced PDE was ergodic even if a finite number of the directions with possibly positive
Lyapunov exponents were not forced. They required a bracket condition in the
spirit of Hörmander's "sum of squares theorem" (see section 17). Unfortunately
they still required rough (algebraically decaying) forcing.



7. Ergodicity under an Effective Ellipticity Assumption 

We now turn to a number of results which allow one to prove ergodicity despite
the fact that $\mathcal{K} \neq \mathbb{Z}^2$. In particular, no lower bound will be placed on the decay
rate of the $|\sigma_k|$; even $|\mathcal{K}| < \infty$ will be allowed if other assumptions are satisfied.
Recalling that $\mathcal{E}_0 = \sum_k |\sigma_k|^2$, we have the following theorem.

Theorem 3. There exists a fixed constant $C$ depending only on the domain so that
the following hold:

• If $C\frac{\mathcal{E}_0}{\nu^3} < 1$ then (3) has a unique $\mathbb{L}^2$-valued invariant probability measure
regardless of the structure of the forcing. [Mat98, Mat99]

• If $|\sigma_k| > 0$ for all $k$ with $|k|^2 \in (0, C\frac{\mathcal{E}_0}{\nu^3})$, then (3) has a unique $\mathbb{L}^2$-valued
invariant probability measure. [EMS01, BKL01]

By an $\mathbb{L}^2$-valued probability measure, we mean a measure $\mu$ such that $\mu(\mathbb{L}^2) = 1$.
The existence was given in [VF88, Fla94] in the case of the SNS and in a more
general setting in [CK97]. Both results of Theorem 3 stem from the following fact
first proven in the stochastic setting in [Mat98] but closely related to ideas in [FP67,
Tem95, CFNT89, FST88]. Contemporaneously to [EMS01] similar techniques were
used in [KS00] to prove a similar theorem for impulsive or "kicked" forcing. Though
these initial results applied only for bounded forcing, those authors later extended
them to cover unbounded forcing. They also proved a convergence theorem of the
kicked case to the white in time case. For the remainder of the discussion of the SNS,
we fix a positive $N_*$. Let $\Pi_\ell$ be the orthogonal projection onto the space spanned
by the wave numbers $k$ with $|k| < N_*$ and let $\Pi_h$ be the complementary orthogonal
projection. We consider the "high mode" equation on $\Pi_h\mathbb{L}^2$ given by

$$\frac{\partial h(x,t)}{\partial t} + \nu A^2 h(x,t) + \Pi_h B(h + \ell, h + \ell) = \frac{\partial \eta(x,t)}{\partial t} \tag{8}$$

where $\ell$ is a given "low mode" trajectory in $\Pi_\ell\mathbb{L}^2$ and $\eta = \Pi_h W(x,t)$. We will denote
by $\Phi^\eta_{s,t}(\ell_{[s,t]}; h_0)$ the solution to (8) at time $t$ with initial condition $h_0$ at time $s$ and
the given external forcings $\ell$ and $\eta$ over the time interval $[s,t]$. A more quantitative
version of the following result is given in Lemma 13.1.

Theorem 4 (Foias and Prodi '67, Mattingly '98). Let $C$ be the same constant
as in Theorem 3. Assuming that $N_*^2 > C\frac{\mathcal{E}_0}{\nu^3}$, there exists a positive constant $\gamma$ so the
following two statements hold.

• Let $u(x,t,W)$ be a solution to (3) on the time interval $[0,\infty)$. Define $\ell(t) =
\Pi_\ell u(t)$ and $\eta(t) = \Pi_h W(t)$. For almost every $W$, there exists a positive constant
$T = T(W,u(0))$ so that for all $t > T$ and $h_0 \in \Pi_h\mathbb{L}^2$

$$|\Phi^\eta_{0,t}(\ell_{[0,t]}; h_0) - \Pi_h u(t,W)|_{\mathbb{L}^2} \le |h_0 - \Pi_h u(0)|_{\mathbb{L}^2}\,e^{-\gamma t}.$$

In particular, if $\tilde u(x,t,\tilde W)$ is another solution on $[0,\infty)$ and $\Omega_0 \subset \Omega \times \Omega$ is
such that for all $(W,\tilde W) \in \Omega_0$ and $t \in [0,\infty)$ one has $\Pi_\ell u(t,W) = \Pi_\ell \tilde u(t,\tilde W)$
and $\Pi_h W(t) - \Pi_h W(0) = \Pi_h \tilde W(t) - \Pi_h \tilde W(0)$, then $u(t,W) = \tilde u(t,\tilde W)$ for all
$t \in [0,\infty)$ and almost every $(W,\tilde W) \in \Omega_0$.

• Let $u(x,t,W)$ be a stationary solution to (3) on the time interval $(-\infty,\infty)$.
Define $\ell(t,W) = \Pi_\ell u(t,W)$ and $\eta(t) = \Pi_h W(t)$. Then with probability one,
there exists a positive constant $C$ depending only on the solution $u$ so that for
$t < 0$

$$|\Phi^\eta_{t,0}(\ell_{[t,0]}; h_0) - \Pi_h u(0)|_{\mathbb{L}^2} \le C(|h_0|_{\mathbb{L}^2} + 1)\,e^{-\gamma|t|}.$$

In particular, if $\tilde u(x,t,\tilde W)$ is another stationary solution on $(-\infty,\infty)$, $\Omega_0 \subset
\Omega \times \Omega$, and $T$ a fixed time, such that for any $(W,\tilde W) \in \Omega_0$ and $s \in (-\infty,T]$,
$\Pi_\ell \tilde u(s,\tilde W) = \Pi_\ell u(s,W)$ and $\Pi_h \tilde W(s) - \Pi_h \tilde W(0) = \Pi_h W(s) - \Pi_h W(0)$, then
$u(s,W) = \tilde u(s,\tilde W)$ for all $s \in (-\infty,T)$ and almost every $(W,\tilde W) \in \Omega_0$.

In other words, the history of the modes with wave number $|k|$ less than $N_*$
combined with the history of the forcing increments on the remaining degrees
of freedom is sufficient to determine the solution uniquely with probability one.

The first statement in Theorem 3 is really a consequence of the contractive 
properties used to prove Theorem 4. It is the special case when the set of determining 
low modes is empty; hence, knowledge of the infinite past of the random forces is 
sufficient to reconstruct the state of the whole system. In general, as shown in 
Theorem 4, one needs some finite number of determining modes and knowledge of 
the random forcing applied to the missing modes to reconstruct the missing modes. 

We now give a more general result which implies the first part of Theorem 3 
by showing that to each realization of noise there corresponds a unique, stationary 
solution if the viscosity is large enough relative to the forcing. Another way of 
saying this is that the system's random attractor, whose existence was proven at 
any viscosity by Flandoli [Fla94], consists of a trivial diffusing point. Schmalfuss 
proved a similar statement using a random fixed point argument in the case of 
multiplicative noise and large viscosity [Sch97]. In that case, the attracting random 
solution is a random fixed point which does not fluctuate in time. 

Theorem 5. If $C\frac{\mathcal{E}_0}{\nu^3} < 1$, then there exists a unique stationary random solution
$u^*(t,W)$ defined for $t \in (-\infty,\infty)$ and almost all $W \in \Omega$. In addition, it attracts all
other solutions exponentially quickly. [Mat98, Mat99]

One of the interesting interpretations of Theorem 4 in the case of arbitrary
viscosity is that on the set of stationary solutions one can define a functional $\Phi :
C(-\infty,0;\Pi_\ell\mathbb{L}^2) \to \Pi_h\mathbb{L}^2$ which reconstructs the high modes from the low modes.
In particular if $u$ is a stationary solution and $\eta = \Pi_h W$ then define

$$\Phi^\eta(\Pi_\ell u_{(-\infty,0]}) = \lim_{t \to -\infty} \Phi^\eta_{t,0}(\Pi_\ell u_{[t,0]}; h_0)$$

for some arbitrary fixed $h_0$. Theorem 4 guarantees that the limit exists, that it is
independent of the choice of $h_0$, and that $\Pi_h u(0,W) = \Phi^\eta(\Pi_\ell u_{(-\infty,0]})$. With this
result, we can close the low mode equations at the price of introducing memory.
One obtains
One obtains 



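$$\frac{\partial \ell(x,t)}{\partial t} + \nu A^2 \ell(x,t) + \Pi_\ell B\big(\ell + \Phi^{\theta_t\eta}(\theta_t\ell),\, \ell + \Phi^{\theta_t\eta}(\theta_t\ell)\big) = \frac{\partial\,\Pi_\ell W(x,t)}{\partial t} \tag{9}$$

where $\theta_t$ is the shift defined on $\ell$ by $(\theta_t\ell)(s) = \ell(s+t)$ and on $\eta$ by $(\theta_t\eta)(s) =
\eta(t+s) - \eta(t)$. This representation is closely related to and inspired by the inertial
form representation from inertial manifolds theory (cf. [CFNT89, EFNT94])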
and the ideas of symbolic dynamics. From the representation in (9), it is clear why 
it might be reasonable to call systems satisfying the assumptions of the second part 
of Theorem 3 "effectively elliptic" diffusions. Under that assumption, the system 
reduces to an equation of the form (9). This Ito process with memory is elliptic in 
the sense that the noise directly agitates all of the coordinates. In contrast to the 
hypoelliptic systems considered in section 17, no detailed knowledge of the tangent 
space structure is needed. Once the assumption about all of the possibly unstable 
directions being forced is satisfied, only some soft general estimates are needed. 

When viewed in the context of Section 5, Theorem 3 might seem surprising. The 
theorem allows the associated diffusion to be degenerate in an infinite number of 
directions; yet the system has nice ergodic properties. Yet in other ways, Theorem 
3 is expected. It simply says that if all of the unstable directions are forced directly, 
the system is ergodic. Since the long time dynamics are governed by the behavior 
on the "unstable manifold" (if one was known to exist), forcing those directions 
destroys all possible obstructions to mixing in the phase space. Since these systems
are non-autonomous, when we say that a collection of directions is stable, we really
mean that all of the Lyapunov exponents associated with these degrees
of freedom are negative.



8. Ergodicity: General Constructions 

We now lay out a more general framework to make some ideas clear without 
being encumbered by specifics. In the next section, we also give a simple toy model 
and some illustrative examples which hopefully will make the ideas concrete. 

Let $(X, |\cdot|_X, \langle\cdot,\cdot\rangle_X)$ be a complete separable Hilbert space with a basis $\{e_k\}$,
$k = 1, 2, \ldots$. Consider the stochastic evolution equation

$$\frac{\partial u}{\partial t} = G(u) + \frac{\partial W}{\partial t} \tag{10}$$

taking values in $X$. Let $\mathcal{D}(G) \subset X$ be the domain of $G$. For concreteness, we take
$W(x,t) = \sum_k \sigma_k e_k b_k(t)$ where the $\sigma_k$ are constants which fix the structure of the forcing
and the $b_k(t)$ are standard variance one Brownian motions. More general forcings
built over a cylindrical Wiener space are possible with further assumptions, but this
will be sufficient for our needs.

We assume that the $\sigma_k$ are chosen so that (10) has a globally defined stochastic
flow $\varphi^W_{s,t}u_0 = u(t,W)$ where $u(s) = u_0$. It is standard to associate with this flow
a random dynamical system defined by the skew flow $\Theta_t(u_0,W) = (\varphi^W_{0,t}u_0, \theta_t W)$
(cf. [Arn98, Kif86]). Here $\theta_t$ is the shift operator. On noise paths the shift is
defined by $(\theta_t W)(s) = W(t+s) - W(t)$. We also define the shift of a trajectory by
$(\theta_t u)(s) = u(t+s)$. The difference in definition is due to the fact that in the first
case we are really shifting the noise increments and not the path itself.
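
The two shift conventions are easy to confuse, so here is a small discrete sketch. It assumes paths are stored as lists sampled on a uniform time grid, with the grid index `n` standing in for the time $t$; the function names `shift_noise` and `shift_trajectory` are hypothetical and used only for illustration.

```python
def shift_noise(W, n):
    """Discrete analog of (theta_t W)(s) = W(t + s) - W(t).

    The shifted path restarts from zero, so only the increments after index n survive."""
    return [w - W[n] for w in W[n:]]

def shift_trajectory(u, n):
    """Discrete analog of (theta_t u)(s) = u(t + s): the path values themselves are kept."""
    return list(u[n:])

if __name__ == "__main__":
    W = [0.0, 0.5, 0.2, 0.9, 1.4]
    u = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(shift_noise(W, 2))       # starts at 0.0, same increments as W after index 2
    print(shift_trajectory(u, 2))  # [3.0, 4.0, 5.0]
```

The noise shift subtracts the value at the new origin because the driving object is the sequence of increments, whereas the trajectory shift simply re-indexes the state.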

Fix a positive integer $N_*$, and define the splitting of the space $X = X_\ell \times X_h$,
by $X_\ell = \mathrm{span}\{e_k : k \le N_*\}$ and $X_h = \mathrm{span}\{e_k : k > N_*\}$. Let $\Pi_\ell$ and $\Pi_h$ be
the orthogonal projectors onto $X_\ell$ and $X_h$ respectively. We will write $u = (\ell,h) =
(\Pi_\ell u, \Pi_h u) \in X_\ell \times X_h$ and $\eta = \Pi_h W$ and $\xi = \Pi_\ell W$. Notice that the probability
measure $\mathbb{P}$ decomposes into $\mathbb{P}_\xi \times \mathbb{P}_\eta$. As before, we will denote segments of trajectories
by an interval of time as a subscript. Hence, $\ell_{[s,t]}$ is a trajectory in $X_\ell$ between time
$s$ and $t$. We use $\Pi_{[s,t]}$ to denote the projection of a path or set of paths onto the
time interval $[s,t]$.

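One can always split the system into two coupled equations on $X_\ell \times X_h$,

$$\frac{\partial h}{\partial t} = \Pi_h G(\ell + h) + \frac{\partial \eta}{\partial t} \tag{11}$$
$$\frac{\partial \ell}{\partial t} = \Pi_\ell G(\ell + h) + \frac{\partial \xi}{\partial t}. \tag{12}$$

As in section 8.1, given this splitting, one can usually define a map $h(t) = \Phi^\eta_{s,t}(\ell_{[s,t]}; h_0)$
which solves (11) given an initial condition $h_0$, noise path $\eta$, and $\ell_{[s,t]}$ viewed as an
external input. Then for each $t_0$, $h_0$, and $\eta$, we can define

$$
\begin{aligned}
\frac{\partial \ell(t)}{\partial t} &= \Pi_\ell G\big(\ell(t) + \Phi^\eta_{t_0,t}(\ell_{[t_0,t]}; h_0)\big) + \frac{\partial \xi(t)}{\partial t} \\
\ell(t_0) &= \ell_0.
\end{aligned}
\tag{13}
$$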

Equation (13) is no longer a standard diffusion as we have introduced memory
through the function $\Phi^\eta_{t_0,t}$. It is critical to notice that $\ell(t)$ remains an adapted Ito
process and hence the power of stochastic calculus can be brought to bear.

For the representation in (13) to be useful in the study of the ergodic theory of
(10), the reduced dynamics (13) must "forget" the choice of $h_0$. One way to investigate
this is to study the system as $t_0 \to -\infty$. If the functional $\Phi$ becomes independent
of $h_0$, then we have a closed dynamics on $C(-\infty,0;X_\ell)$ over the probability
space $\Omega$. The resulting stochastic process could have infinite memory. Since it is
defined by a compatible family of Gibbs measures, in [EMS01] it was dubbed "Gibbsian
dynamics" to be contrasted with Markovian dynamics. The ergodic theory of
systems with this type of memory was explored in its own right in [Bak02, BM03].

Alternatively, one could study the measures induced on the infinite future for 
different choices of ho and show that they induce the same asymptotic dynamics in 
some sense. This was the point of view taken in [Mat02c]. 

The two approaches are more or less equivalent and each has its own difficulties.
One difficulty of the memory/Gibbsian dynamics approach is that sometimes
the limit, $\lim_{t_0 \to -\infty} \Phi^\eta_{t_0,t}(\ell_{[t_0,t]}; h_0)$, only exists on a restricted set of paths. In any
situation where the approach works, one can always take $\ell_{(-\infty,t]}$ which are typical
realizations of a stationary solution obtained by suspending any invariant measure
over path space. But such a characterization is not constructive and at times is
difficult to work with.

At the most basic level, the success of the approach developed in [EMS01] (or
[KS00, BKL01] for that matter) hinges on treating the $\ell$ and $h$ variables in fundamentally
different ways. Since the $\ell$ variable is finite dimensional in all the situations
we consider, all the difficulties of probabilistic calculations in an infinite dimensional
setting, mentioned in section 4, are not an issue. In particular, the time $t$ transition
probabilities projected onto $X_\ell$ will have densities relative to Lebesgue measure on $X_\ell$
if all of the directions in $X_\ell$ are forced. The analysis of the $h$ variable is dynamic
in nature. The analysis is done noise realization by noise realization. In contrast,
the analysis of the $\ell$ variable is probabilistic in nature. Arguments are made at the
level of transition densities. If the system is strongly contractive, then the structure
of the forcing is irrelevant. This was the fundamental fact used in [Mat99] to prove
ergodicity by showing the existence of a distinguished globally attracting solution.
Another way to say this is that the random attractor is trivial, consisting of a single
point at each moment of time. Given our splitting, a similar structure remains in the
$h$ variable. As we will see, such contraction, $\eta$-fiber by $\eta$-fiber, is much less sensitive
to the topology than are questions like the absolute continuity of measures. The
basic idea is to change the measure on the $\ell$ variables in such a way that the remaining
degrees of freedom are contractive. The analyses in [Mat02c] and [EMS01]
accomplish this by making the $\ell$s agree after some finite time. In [Hai02], the measure
is changed to bring the $\ell$ (and $h$) together asymptotically at infinity but never
at a finite time. In all cases, care must be taken so that the changes in the measure
do not accumulate to the extent that the limiting measures become singular.

To execute this program, we need to analyze the dynamics on the path space of
$X_h$ and understand the structure of the measures induced on the path space of $X_\ell$.
To this end, we make a few definitions. For all $t > s \ge 0$ define

$$
\begin{aligned}
Q_t(\ell_0,h_0,A) &= \mathbb{P}(\ell(t) \in A \mid \ell(0) = \ell_0, h(0) = h_0) \\
Q_{[s,t)}(\ell_0,h_0,B) &= \mathbb{P}(\ell_{[s,t)} \in B \mid \ell(0) = \ell_0, h(0) = h_0)
\end{aligned}
\tag{14}
$$

for Borel sets $A \subset X_\ell$ and $B \subset C([s,t);X_\ell) = C([0,t-s);X_\ell)$. Notice we have
associated $C([s,t),X_\ell)$ with $C([0,t-s),X_\ell)$ and will view $\ell_{[s,t)}$ as an element of
$C([0,t-s),X_\ell)$.

Similarly for any realization of $\eta$, let $\mathcal{F}^\eta_{[s,t]}$ be the $\sigma$-algebra generated by the
increments of $\eta$ on $[s,t]$. We define $Q^\eta_t(\ell_0,h_0,A) = \mathbb{P}(\ell(t) \in A \mid \ell(0) = \ell_0, h(0) =
h_0, \mathcal{F}^\eta_{[0,t]})$ and $Q^\eta_{[s,t)}(\ell_0,h_0,B) = \mathbb{P}(\ell_{[s,t)} \in B \mid \ell(0) = \ell_0, h(0) = h_0, \mathcal{F}^\eta_{[0,t]})$. These
are analogous to the previous measures except that we have conditioned on the
realization of $\eta$ over the time interval in question. Hence, for $A \subset X_\ell$,

$$Q_t(\ell_0,h_0,A) = \int Q^\eta_t(\ell_0,h_0,A)\,\mathbb{P}(d\eta) = \mathbb{E}\,Q^\eta_t(\ell_0,h_0,A).$$

8.1. A Toy Problem

We now describe a simple toy problem which contains the main ideas needed to
prove the results of the previous section. We will use the same notation to make the 
connections explicit. 

Consider the following two dimensional stochastic differential equation

$$\frac{dh}{dt} = -\nu_1 h + F_1(\ell, h) + \sigma_1 \frac{d\eta}{dt} \tag{15}$$
$$\frac{d\ell}{dt} = -\nu_2 \ell + F_2(\ell, h) + \sigma_2 \frac{d\xi}{dt}. \tag{16}$$

Here $\nu_i > 0$, $\sigma_i \ge 0$, and $\eta$ and $\xi$ are standard one dimensional Brownian motions on
the probability space $\Omega = C((-\infty,\infty); \mathbb{R}^2)$. Hence, in the notation of the previous
section, $X = \mathbb{R}^2$, $X_\ell = \mathbb{R}$, and $X_h = \mathbb{R}$. We assume the following estimates hold:
$|F_1| + |F_2| \le K$ and $|F_i(\ell, h) - F_i(\ell, \tilde h)| \le L_i |h - \tilde h|$. For the moment, we allow either
or both of the $\sigma_i$ to be zero. Eventually, we will require only that $\sigma_2 > 0$, allowing
$\sigma_1$ to be zero if desired. Since the $F_i$ are uniformly bounded, it is easy to see that
$\limsup_{t\to\infty} \mathbb{E}[h^2(t) + \ell^2(t)]$ is uniformly bounded over all initial conditions. From
this, one can deduce the existence of an invariant measure using standard tightness
arguments. The stochastic flow $\varphi^{\xi,\eta}_{s,t}(\ell_0, h_0)$ and the functional $\Phi^\eta_{s,t}(\ell_{[s,t]}; h_0)$ are
defined as in the previous section.

Subtracting two copies of (15) with the same $\eta$ and $\ell_{[s,t]}$ but different initial
conditions $h_0$ and $\tilde h_0$ produces the estimate

$$\big|\Phi^\eta_{s,t}(\ell_{[s,t]}; h_0) - \Phi^\eta_{s,t}(\ell_{[s,t]}; \tilde h_0)\big| \le |h_0 - \tilde h_0|\, e^{-(\nu_1 - L_1)(t-s)}. \tag{17}$$

Using this estimate immediately produces the following result, which is the analog
of Theorem 4.

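Lemma 8.1. Assume $\nu_1 > L_1$. Given $\ell \in C((-\infty,0]; \mathbb{R})$ and $h_0 \in \mathbb{R}$, the limit

$$\Phi^\eta(\ell_{(-\infty,0]}) = \lim_{s \to -\infty} \Phi^\eta_{s,0}(\ell_{[s,0]}; h_0)$$

is well defined almost surely and independent of $h_0$. Similarly, fix a time interval
$[s,t]$ and initial conditions $(\ell_i(s), h_i(s))$, $i = 1, 2$. Let $\Omega_0 \subset \Omega \times \Omega$ be such that for all
$(\xi_1, \eta_1, \xi_2, \eta_2) \in \Omega_0$, $\eta_1(r) - \eta_1(s) = \eta_2(r) - \eta_2(s)$ and $\ell_1(r) = \ell_2(r)$ when $r \in [s,t]$,
where $(\ell_i(r), h_i(r)) = \varphi^{\xi_i,\eta_i}_{s,r}(\ell_i(s), h_i(s))$. Then for all $(\xi_1, \eta_1, \xi_2, \eta_2) \in \Omega_0$,

$$|h_1(t) - h_2(t)| \le |h_1(s) - h_2(s)|\, e^{-(\nu_1 - L_1)(t-s)}.$$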
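The contraction in (17) and Lemma 8.1 can be checked numerically. The following is a minimal sketch, not taken from the cited papers: it integrates the $h$ equation (15) by the Euler-Maruyama scheme for two initial values, feeding both copies the same increments of $\eta$ and the same low-mode path $\ell$. The nonlinearity `F1`, the choice of a Brownian path for $\ell$, and all parameter values are assumptions made only for illustration.

```python
import math
import random


def contraction_gaps(nu1=2.0, L1=0.5, sigma1=1.0, T=5.0, dt=1e-3,
                     h_a=3.0, h_b=-4.0, seed=0):
    """Return [(t_k, |h_a(t_k) - h_b(t_k)|)] for two Euler-Maruyama copies of
    dh = (-nu1*h + F1(l, h)) dt + sigma1 d(eta), driven by the same eta and l."""
    rng = random.Random(seed)

    def F1(l, h):
        # Hypothetical nonlinearity: bounded by L1 and Lipschitz in h with constant L1.
        return L1 * math.cos(l) * math.sin(h)

    n_steps = int(round(T / dt))
    root_dt = math.sqrt(dt)
    l = 0.0
    out = [(0.0, abs(h_a - h_b))]
    for k in range(1, n_steps + 1):
        d_eta = rng.gauss(0.0, root_dt)      # increment of eta, shared by both copies
        h_a += (-nu1 * h_a + F1(l, h_a)) * dt + sigma1 * d_eta
        h_b += (-nu1 * h_b + F1(l, h_b)) * dt + sigma1 * d_eta
        l += rng.gauss(0.0, root_dt)         # shared low-mode path (illustrative choice)
        out.append((k * dt, abs(h_a - h_b)))
    return out


if __name__ == "__main__":
    nu1, L1 = 2.0, 0.5
    gaps = contraction_gaps(nu1=nu1, L1=L1)
    t_end, gap_end = gaps[-1]
    bound = gaps[0][1] * math.exp(-(nu1 - L1) * t_end)
    print(f"gap at t={t_end:.1f}: {gap_end:.3e}, bound from (17): {bound:.3e}")
```

Because the noise increments cancel in the difference of the two copies, the gap obeys the bound path by path, whatever the seed. The discrete scheme inherits the bound as long as $\nu_1\,dt \le 1$.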

Recalling that the shift $\theta_t$ on trajectories acts by $(\theta_t \ell)(s) = \ell(t+s)$ and on noise
paths by $(\theta_t \eta)(s) = \eta(t+s) - \eta(t)$, then by Lemma 8.1 we can reduce the system
to the following system with memory

$$\frac{d\ell}{dt} = -\nu_2 \ell + F_2\big(\ell, \Phi^\eta(\theta_t \ell)\big) + \sigma_2 \frac{d\xi}{dt}, \qquad \ell(0) = \ell_0 \tag{18}$$

where now $\ell(t)$ is seen as an element of $C((-\infty,t]; \mathbb{R})$. Similarly our initial condition
$\ell_0$ is an element of $C((-\infty,0]; \mathbb{R})$.

We now turn to another auxiliary result which, along with the contraction embodied
in Lemma 8.1, is the linchpin on which ergodicity hangs. Recalling the definitions
from (14), we have

Lemma 8.2. Assume $\sigma_2 > 0$ and $\nu_1 > L_1$. For all $\ell_0, h_0 \in \mathbb{R}$, the measure
$Q_t(\ell_0, h_0, \cdot\,)$ is equivalent to Lebesgue measure. For all $\ell_0, h_0, \tilde h_0 \in \mathbb{R}$, the measure
$Q_{[0,\infty)}(\ell_0, h_0, \cdot\,)$ is equivalent to $Q_{[0,\infty)}(\ell_0, \tilde h_0, \cdot\,)$. For any realization of $\eta$, the
exact same conclusions hold with $Q_t$ replaced by $Q^\eta_t$ and $Q_{[0,\infty)}$ replaced by $Q^\eta_{[0,\infty)}$.

In the next section, we will use Lemma 8.2 to prove the ergodicity of the
toy problem (equations (15) and (16)). Of course, if $\sigma_1, \sigma_2 > 0$ then the system is
uniformly elliptic and the fact that there is a unique invariant measure follows from
standard elliptic theory. Even when $\sigma_1 = 0$, one might well use hypoelliptic diffusion
theory to prove ergodicity. What we present here is a different possible route,
where the detailed knowledge of the tangent space structure used in hypoelliptic
arguments is replaced with assumptions about the system's Lyapunov exponents.
The advantage of this route is that the contractive properties are less sensitive
to the choice of topology than the measure theoretic properties of the system needed
for the more standard approaches to ergodicity.

The fact that $Q_{[0,\infty)}(\ell_0, h_0, \cdot\,)$ and $Q_{[0,\infty)}(\ell_0, \tilde h_0, \cdot\,)$ are equivalent measures
on the infinite time interval $[0,\infty)$ is critical. Absolute continuity on finite time
intervals would not be sufficient. As an illustrative example consider the measures
induced on path space by a standard Brownian motion $B(t)$ and the SDEs

dX{t) ym ,dB(t) dY{t) 1 dB(t) 

X(0) = x Y(0) = x . 

All three processes induce measures which are pairwise equivalent on any finite
segment of path space. However, only the processes $X$ and $Y$ are equivalent on the
infinite futures because the difference of their drifts is square integrable on an infinite time
interval. See the proof of the second part of Lemma 8.2 for the needed argument. In
particular, we see that $X(t)$ and $Y(t)$ have the same asymptotic behavior at the
level of the path space marginals, while $B(t)$ has a different one. Notice that we do
not mean that $|X(t) - Y(t)| \to 0$ as $t \to \infty$.

Intuitively it is clear why Lemma 8.2, when combined with the contractive estimate
from (17), implies that there is only one invariant measure. From Lemma
8.2, we see that any two invariant measures will induce equivalent measures on
$C(0,\infty; X_\ell)$. Hence they will charge trajectories with the same projection onto
$X_\ell$. This is already enough to ensure that the distribution on $X_\ell$ is unique. However,
because of (17), if the two paths share the same projection onto $X_\ell$ for all time
the remaining degrees of freedom will also converge. Hence the time averages along
some typical paths of the two measures will be the same. This implies the measures
are the same. In Theorem 6, we make this argument precise.

Proof of the first part of Lemma 8.2: We now prove the statements
concerning $Q_t$ and $Q^\eta_t$. We need only to show that the measures, conditional on
$\eta$, are equivalent since the full measures are simply the integration of the conditioned
measures against the Wiener measure governing $\eta$. We will use Girsanov's
Theorem (cf. [Oks92, RY94] Ch 8, Thm 1.1) to compare (16) with the Ornstein-Uhlenbeck
process $\frac{dz}{dt} = -\nu_2 z + \sigma_2 \frac{d\xi}{dt}$. Girsanov's Theorem states that the two
measures on path space are equivalent if a certain exponential martingale, which
gives the Radon-Nikodym derivative, is uniformly integrable. This is guaranteed by
Novikov's criterion (cf. [Oks92, RY94] Ch 8, Prop 1.15) which, translated into our
setting, becomes $\mathbb{E}\exp\big(\frac12 \int_0^t \frac{1}{\sigma_2^2} |F_2(\ell, h)|^2\, ds\big) < \infty$. Since by assumption

$$\mathbb{E}\exp\Big(\frac12 \int_0^t \frac{1}{\sigma_2^2} |F_2(\ell(s), h(s))|^2\, ds\Big) \le \exp\Big(\frac{K^2}{2\sigma_2^2}\, t\Big) < \infty,$$

we know that the measures induced on path space by $\ell_{[0,t]}$ conditioned on $\eta$ and by
$z_{[0,t]}$ are equivalent. This in turn implies that the time $t$ marginals are equivalent. Since
the law of $z(t)$ for fixed $t$ is Gaussian and thus equivalent to Lebesgue measure, the
proof is complete. □

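Proof of the second part of Lemma 8.2: We now prove the statements
concerning $Q_{[0,\infty)}$ and $Q^\eta_{[0,\infty)}$. Again we use Girsanov's Theorem and only consider
the conditioned measures. This time we compare the measures induced on $[0,t]$ by $\ell$
starting from the same $\ell_0$ with the same $\eta$ but different $h_0$'s. In this case, Novikov's
criterion becomes

$$\mathbb{E}\exp\Big(\frac12 \int_0^t \frac{1}{\sigma_2^2} \big|F_2\big(\ell, \Phi^\eta_{0,s}(\ell_{[0,s]}; h_0)\big) - F_2\big(\ell, \Phi^\eta_{0,s}(\ell_{[0,s]}; \tilde h_0)\big)\big|^2\, ds\Big)$$
$$\le \mathbb{E}\exp\Big(\frac12 \int_0^t \frac{L_2^2}{\sigma_2^2} \big|\Phi^\eta_{0,s}(\ell_{[0,s]}; h_0) - \Phi^\eta_{0,s}(\ell_{[0,s]}; \tilde h_0)\big|^2\, ds\Big)$$
$$\le \exp\Big(\frac12 \int_0^t \frac{L_2^2}{\sigma_2^2} |h_0 - \tilde h_0|^2 e^{-2(\nu_1 - L_1)s}\, ds\Big)$$
$$\le \exp\Big(\frac{L_2^2}{4\sigma_2^2(\nu_1 - L_1)} |h_0 - \tilde h_0|^2\Big).$$

Since the bound is finite and uniformly bounded in $t$, we conclude that the measures
on path space are equivalent on the time interval $[0,\infty)$. □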
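The key point of the second part is that the exponent in Novikov's criterion stays bounded as $t \to \infty$. The sketch below evaluates that exponent, $\frac12 \int_0^t \frac{L_2^2}{\sigma_2^2} |h_0 - \tilde h_0|^2 e^{-2(\nu_1 - L_1)s}\,ds$, in closed form and by a midpoint rule, and compares both with the limiting value $\frac{L_2^2 |h_0 - \tilde h_0|^2}{4\sigma_2^2(\nu_1 - L_1)}$. The function names and the parameter values in the usage lines are illustrative assumptions.

```python
import math


def novikov_exponent(t, L2, sigma2, nu1, L1, dh0):
    """Closed form of (1/2) * int_0^t (L2/sigma2)^2 * dh0^2 * exp(-2*(nu1-L1)*s) ds."""
    rate = nu1 - L1
    if rate <= 0 or sigma2 <= 0:
        raise ValueError("need nu1 > L1 and sigma2 > 0")
    limit = (L2 ** 2) * (dh0 ** 2) / (4.0 * sigma2 ** 2 * rate)
    return limit * (1.0 - math.exp(-2.0 * rate * t))


def novikov_exponent_quadrature(t, L2, sigma2, nu1, L1, dh0, n=20000):
    """Midpoint-rule approximation of the same integral, as an independent check."""
    rate = nu1 - L1
    ds = t / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * ds
        total += 0.5 * (L2 / sigma2) ** 2 * dh0 ** 2 * math.exp(-2.0 * rate * s) * ds
    return total


if __name__ == "__main__":
    params = dict(L2=1.0, sigma2=0.5, nu1=2.0, L1=0.5, dh0=2.0)
    limit = params["L2"] ** 2 * params["dh0"] ** 2 / (
        4.0 * params["sigma2"] ** 2 * (params["nu1"] - params["L1"]))
    for t in (0.5, 2.0, 10.0, 100.0):
        print(t, novikov_exponent(t, **params), "<=", limit)
```

If instead $\nu_1 \le L_1$, the integrand no longer decays and the exponent is not bounded uniformly in $t$, which is why the function rejects that case.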



8.2. Basic Ergodicity 

We now present some general theorems which we will use to prove the ergodicity 
of the SNS equations and the toy model. Hopefully, the assumptions will seem 
natural in light of the structure of the toy model. 

Assumption 1. There exists a set $B \subset C(0,\infty; X)$, with $\mathbb{P}\{\varphi^W_{[0,\infty)} u_0 \in B\} = 1$ for
all $u_0 \in X$, so that the following holds:

If $u(t, W)$ and $u(t, \tilde W)$ are solutions to (10) and $\Omega_0$ is a subset of $\Omega \times \Omega$, so that

$$(W, \tilde W) \in \Omega_0 \implies \begin{cases} u(\,\cdot\,, W),\ u(\,\cdot\,, \tilde W) \in B \\ \Pi_h W(t) - \Pi_h W(0) = \Pi_h \tilde W(t) - \Pi_h \tilde W(0) & \text{for all } t \ge 0 \\ \Pi_\ell u(t, W) = \Pi_\ell u(t, \tilde W) & \text{for all } t \ge 0 \end{cases}$$

then

$$\Pi_h u(t, W) - \Pi_h u(t, \tilde W) \to 0 \text{ as } t \to \infty \text{ for all } (W, \tilde W) \in \Omega_0.$$

In the toy model the set $B$ was not needed; the conclusion held for all paths.
This is not true in more general settings; we need to restrict ourselves to a set of
"nice" paths. For the SNS equation, $B$ will be the set of paths which grow and
average in a typical fashion. Recall from (14) that $Q^\eta_t(u_0, \cdot\,)$ and $Q^\eta_{[t,\infty)}(u_0, \cdot\,)$ are
respectively the measure induced on the "low modes" $X_\ell$ at time $t$ by $u(t)$ and on
the path space $C([0,\infty), X_\ell)$ by $u_{[t,\infty)}$ if one conditions to use the noise realization
$\eta$ and to start from the initial condition $u_0$ at time $t = 0$.

Assumption 2. For all $u_0 = (\ell_0, h_0) \in X_\ell \times X_h$, $Q^\eta_t(\ell_0, h_0, \cdot\,)$ is equivalent to
Lebesgue measure for almost every $\eta$. For all $u_0 = (\ell_0, h_0)$ and $\tilde u_0 = (\ell_0, \tilde h_0) \in
X_\ell \times X_h$, the measure $Q^\eta_{[0,\infty)}(\ell_0, h_0, \cdot\,)$ is equivalent to $Q^\eta_{[0,\infty)}(\ell_0, \tilde h_0, \cdot\,)$ for almost
every $\eta$.

As noted in the analysis of the toy problem, the equivalence of the measures
conditioned on $\eta$ implies the equivalence of the unconditioned versions.

Theorem 6. If Assumptions 1 and 2 hold, then (10) has at most one $X$-valued
invariant probability measure.

By an $X$-valued probability measure $\mu$, we mean a measure such that $\mu(X) = 1$.
Once this theorem is proven, we will have proven the ergodicity of the toy problem
from the previous section. In the SNS setting, notice that Assumption 2 is close to
Theorem 4. Lemma 13.1 makes the set $B$ explicit. We now state a number of lemmas
which will be used to prove Theorem 6.

Lemma 8.3. Assume Assumption 2 holds. For any pair of initial conditions $u_0 =
(\ell_0, h_0)$ and $\tilde u_0 = (\tilde\ell_0, \tilde h_0) \in X_\ell \times X_h$ and any $t > 0$, the measures $Q_{[t,\infty)}(\ell_0, h_0, \cdot\,)$
and $Q_{[t,\infty)}(\tilde\ell_0, \tilde h_0, \cdot\,)$ are equivalent. Similarly for almost every $\eta$, $Q^\eta_{[t,\infty)}(\ell_0, h_0, \cdot\,)$
is equivalent to $Q^\eta_{[t,\infty)}(\tilde\ell_0, \tilde h_0, \cdot\,)$.

Given any invariant measure $\mu$, we define two classes of associated measures:
one on the future trajectories and one on the past trajectories. Let $\mathcal{M}_-$ denote the
natural measure on $C((-\infty,0]; X)$ defined by cylinder sets of the type: for some
$t_0, t_1, \ldots, t_n$, with $t_0 < t_1 < t_2 < \cdots < t_n \le 0$,

$$A = \{(\ell(s), h(s)) \in C((-\infty,0], X) : (\ell(t_i), h(t_i)) \in A_i,\ i = 0, \ldots, n\}$$

where the $A_i$'s are Borel sets of $X$. The definition

$$\mathcal{M}_-(A) = \mu(A_0) \cdot \mathbb{P}\{u(t_i) \in A_i,\ i = 1, \ldots, n \mid u(t_0) \in A_0\}$$

characterizes the measure. Similarly we define $\mathcal{M}_+$ on $C([0,\infty); X)$. We also define
$\mathcal{M}^\eta_+$ on $C([0,\infty); X)$ by pushing $\mu$ forward under dynamics conditioned to use the
noise realization $\eta$. We will define a measure $\mathcal{M}^\eta_-$ at the end of the section and
explore its properties. Recall that $\Pi_\ell$ is the projection onto $X_\ell$; hence,
$\mathcal{M}^\eta_+ \Pi_\ell^{-1}$ is a measure on $C([0,\infty); X_\ell)$. Analogously for $t > s$, we define $\Pi_{[s,t)}$ as the
projection onto the space $C([s,t); X)$.

Assumption 3. Let $\mu_1$ and $\mu_2$ be any two invariant measures and let $\mathcal{M}^\eta_{+,1}$ and
$\mathcal{M}^\eta_{+,2}$ be the measures induced on $C([0,\infty); X)$ described above. Then for almost
every $\eta$, $\mathcal{M}^\eta_{+,1}\Pi_\ell^{-1}$ is equivalent to $\mathcal{M}^\eta_{+,2}\Pi_\ell^{-1}$.

Lemma 8.4. Assumption 2 implies Assumption 3.

Proof of Lemma 8.4: Since $\mu$ is invariant, for any $B \subset C([t,\infty); X_\ell)$ and $t > 0$

Mlnj\B)= I Q e [t -^(£ ,h ,B) f x(d£ xdho) 

the result follows from Lemma 8.3 and the fact that $\theta_t$ is ergodic, hence maps one set
of full measure in $\Pi_h \Omega$ to another set of full measure. □

Assumption 3 is weaker than Assumption 2. As the next lemma shows, it is
sufficient to prove ergodicity. In some settings where solutions to the initial value
problem do not have nice moment properties it is more convenient to work directly
with stationary solutions. This type of analysis is presented in [BM03]. However,
in systems like the SNS equations such difficulties do not arise and, as we shall see,
Assumption 2 holds.

In light of the above lemma, the following result implies Theorem 6.

Lemma 8.5. If Assumptions 1 and 3 hold, then (10) has at most one $X$-valued
invariant probability measure.

Note: If one is only interested in events which depend on the part of the
path in $X_\ell$, then Assumption 1 is not needed.

Proof of Lemma 8.5: Since all invariant measures are a linear combination
of ergodic measures, it is enough to show there is a unique ergodic measure. Let
$\mu_1$ and $\mu_2$ be two different ergodic measures. Let $\mathcal{M}_{+,1}$, $\mathcal{M}^\eta_{+,1}$, $\mathcal{M}_{+,2}$ and $\mathcal{M}^\eta_{+,2}$
be the associated measures defined above. Let $\phi : X_\ell \times X_h \to \mathbb{R}$ be a bounded measurable
test function with $\sup|\phi| \le 1$ and $\phi(\ell, \cdot\,) \in \mathrm{Lip}_1(X_h)$ for all $\ell$. The
norm induced on measures by this class of test functions dominates the Wasserstein
(or Kantorovich) distance for measures. Hence, this class of test functions is rich
enough so that if $\int \phi\, d\mu_1 = \int \phi\, d\mu_2$ for all such $\phi$ then $\mu_1 = \mu_2$ (cf. [Dud76]). Since
$\mu_i$ is invariant under the flow induced on measures, the Birkhoff ergodic theorem
implies that there exist sets $A_i \subset C([0,\infty), X)$ such that $\mathcal{M}_{+,i}(A_i) = 1$ and for all
$(\ell, h) \in A_i$

$$\lim_{t\to\infty} \frac1t \int_0^t \phi(\ell(s), h(s))\, ds = \bar\phi_i = \int \phi(x, y)\, \mu_i(dx \times dy). \tag{19}$$

Define $\tilde A_i = A_i \cap B$ where $B$ is the set from Assumption 1. Again remark that
$\mathcal{M}_{+,i}(\tilde A_i) = 1$ since $B$ has full measure. Since $\mathcal{M}_{+,i}(\tilde A_i) = \mathbb{E}\,\mathcal{M}^\eta_{+,i}(\tilde A_i) = 1$,
$\mathcal{M}^\eta_{+,i}(\tilde A_i) = 1$ for almost every $\eta$. Let $A^\eta_i$ be a subset of $\tilde A_i$ of full $\mathcal{M}^\eta_{+,i}$-measure
so that the paths in $A^\eta_i$ are solutions with a noise realization $W$ such that $\Pi_h W = \eta$. By
Assumption 3, $\mathcal{M}^\eta_{+,1}\Pi_\ell^{-1}$ is equivalent to $\mathcal{M}^\eta_{+,2}\Pi_\ell^{-1}$. Hence, $\mathcal{M}^\eta_{+,1}\Pi_\ell^{-1}(\Pi_\ell A^\eta_2) > 0$,
which implies that $\Pi_\ell A^\eta_1 \cap \Pi_\ell A^\eta_2$ is not empty since $\mathcal{M}^\eta_{+,1}\Pi_\ell^{-1}(\Pi_\ell A^\eta_1) = 1$. Hence,
the set $A = \{(\ell, h_1, h_2) : (\ell, h_1) \in A^\eta_1 \text{ and } (\ell, h_2) \in A^\eta_2\}$ is not empty. Fixing some
$(\ell, h_1, h_2) \in A$, from (19) and Assumption 1 we have that for any $\epsilon > 0$ there exists
a $T$ so that for all $t > T$

$$\Big|\frac1t \int_0^t \phi(\ell(s), h_i(s))\, ds - \bar\phi_i\Big| < \frac{\epsilon}{4}$$

and $|h_1(t) - h_2(t)|_X < \frac{\epsilon}{4}$. This last inequality holds because the hypotheses of
Assumption 1 are satisfied. Hence,

$$|\bar\phi_1 - \bar\phi_2| \le \frac{\epsilon}{2} + \Big|\frac1t \int_0^t \phi(\ell(s), h_1(s))\, ds - \frac1t \int_0^t \phi(\ell(s), h_2(s))\, ds\Big| \le \frac{\epsilon}{2} + \frac{2T}{t} + \frac{\epsilon}{4}\,\frac{t - T}{t} < \epsilon \quad \text{for } t \text{ sufficiently large.}$$

Since $\epsilon$ was arbitrary, the proof is complete. □

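Proof of Lemma 8.3: Let $A \subset C(0,\infty; X_\ell)$. We will show that

$$Q^\eta_{[t,\infty)}(\ell_0, h_0, A) = 0 \text{ implies that } Q^\eta_{[t,\infty)}(\tilde\ell_0, \tilde h_0, A) = 0.$$

First notice that for $B \subset \Pi_h X$, if $H^\eta_t(\ell_0, h_0, x, B) = \mathbb{P}\{h(t) \in B \mid \ell(0) = \ell_0,\ h(0) =
h_0,\ \ell(t) = x,\ \mathcal{F}^\eta_{[0,t]}\}$ for $t > 0$ then

$$Q^\eta_{[t,\infty)}(\ell_0, h_0, A) = \int\!\!\int Q^\eta_t(\ell_0, h_0, dx)\, H^\eta_t(\ell_0, h_0, x, dy)\, Q^{\theta_t\eta}_{[0,\infty)}(x, y, A). \tag{20}$$

Hence for almost every $\eta$, $Q^\eta_{[t,\infty)}(\ell_0, h_0, A) = 0$ implies that $Q^{\theta_t\eta}_{[0,\infty)}(x, y, A) = 0$ for
$\mathrm{Leb}(dx) \times H^\eta_t(\ell_0, h_0, x, dy)$ almost every $(x, y)$ because by assumption $Q^\eta_t(\ell_0, h_0, \cdot\,)$
is equivalent to Lebesgue measure. By the second part of Assumption 2, we know
that $Q^{\tilde\eta}_{[0,\infty)}(x, y, \cdot\,)$ is equivalent to $Q^{\tilde\eta}_{[0,\infty)}(x, \tilde y, \cdot\,)$ for all $x, y, \tilde y$ and for $\mathbb{P}$-almost
every $\tilde\eta$. Hence, $Q^{\theta_t\eta}_{[0,\infty)}(x, \tilde y, A) = 0$ for $\mathrm{Leb}(dx) \times H^\eta_t(\tilde\ell_0, \tilde h_0, x, d\tilde y)$ almost every
$(x, \tilde y)$ and $\mathbb{P}$-almost every $\eta$. (Here we have used that the shift is ergodic with
respect to $\mathbb{P}$, so $\theta_t$ maps a set of full measure to another set of full measure.) And
hence, by the representation for $Q^\eta_{[t,\infty)}(\tilde\ell_0, \tilde h_0, A)$ analogous to (20), we conclude

$$Q^\eta_{[t,\infty)}(\tilde\ell_0, \tilde h_0, A) = 0. \qquad □$$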



8.3. One Force, One Solution: Statistical Equilibrium 
Measures and Trivial Random Attractors 

In analogy to $\mathcal{M}_+$, we define $\mathcal{M}^\eta_-$ on $C((-\infty,0]; X)$ as the limit as $t \to -\infty$
of $\mathcal{M}^\eta_t = \mathbb{E}\{\varphi^{\xi,\eta}_{[t,0]}\mu \mid \mathcal{F}^\eta_{(-\infty,0]}\}$. As discussed in [LJ87, DLJ88, Bax91], the sequence
is a backwards martingale, hence the limit exists almost surely by the martingale
convergence theorem. By $\varphi^{\xi,\eta}_{[t,0]} u_0$ we mean the entire piece of trajectory on $[t,0]$.
Similarly, one can define $\mathcal{M}^{\xi,\eta}_-$ on $C((-\infty,0]; X)$ by the limit $t \to -\infty$ of $\varphi^{\xi,\eta}_{[t,0]}\mu$.
This is called the equilibrium measure [LJ87] and $\mathcal{M}^\eta_- = \mathbb{E}\{\mathcal{M}^{\xi,\eta}_- \mid \mathcal{F}^\eta_{(-\infty,0]}\}$. In
a similar manner, one can define $\mathcal{M}^\eta$ on all of $C(-\infty,\infty; X_\ell)$.

In the case of the SNS equations Flandoli, Crauel, and Debussche [Fla94, CDF97]
proved the existence of a compact random attractor $\mathcal{A}(\xi, \eta) \subset L^2$ which attracts all
bounded subsets $B \subset L^2$ in the sense

$$\lim_{t \to -\infty} d\big(\varphi^{\xi,\eta}_{t,0} B,\ \mathcal{A}(\xi, \eta)\big) = 0$$

where $d$ is the symmetric Hausdorff distance on sets.

If we define the action of the shift $\theta_t$ on $\eta$ as $(\theta_t\eta)(s) = \eta(t+s) - \eta(t)$, the measure
$\mathcal{M}^\eta$ is invariant under the skew flow on measures fibered over $\eta$. Then we have, for
example, $\mathbb{E}\{\varphi^{\xi,\eta}_t \mathcal{M}^\eta \mid \mathcal{F}^\eta\} = \mathcal{M}^{\theta_t\eta}$. (Recall that $\mathcal{F}^\eta$ was the sigma algebra generated
by $\eta$.) Similarly, when a random attractor exists, $\varphi^{\xi,\eta}_t \mathcal{A}(\xi, \eta) = \mathcal{A}(\theta_t\xi, \theta_t\eta)$.

We can consider the equation (11) in isolation over a probability space $\Pi_h\Omega \times
C(-\infty,\infty; X_\ell)$ with the measure $\mathbb{P}^\eta(d\eta) \times \mathcal{M}^\eta\Pi_\ell^{-1}(d\ell)$. In other words, we have
elevated the part of the phase space $C(-\infty,\infty; X_\ell)$ to part of the base probability
space. On this space the $h(t)$ dynamics has the same property as the whole SNS
equation under the extremely contractive assumption ($N^2 > C\frac{\mathcal{E}_0}{\nu^3}$). In particular,
an analogous theorem to Theorem 5 holds: there is a unique solution $h^*(t; \eta, \ell)$
which attracts all other solutions. In these coordinates, the random attractor for
the equation (11) is the single solution $h^*(t; \eta, \ell)$. Therefore, the invariant measure
$\mu$ from above projected onto $X_h$ disintegrates into a delta measure
$\delta_{h^*(0; \eta, \ell)}$ against the measure $\mathbb{P}^\eta(d\eta) \times \mathcal{M}^\eta\Pi_\ell^{-1}(d\ell)$. That is to say, if $F : X_\ell \times X_h \to \mathbb{R}$
then



This is the analog for the partially dissipative system of the "one force, one solution"
(i.e. trivial random attractor) discussed in [Mat99, EKMS00, Sch97, LJ87, Mat02a,
MY02, EVE00] or exemplified by Theorem 5. A similar statement holds for the toy
problem and all of the systems satisfying the assumptions in section 8.2.

9. Contractive Nature of the SNS Dynamics 

The proof of ergodicity of the SNS under the assumption that only the "deter- 
mining modes" are forced will parallel the proof of the toy model. Our first step 
is to establish Assumption 1 in the context of the SNS. We do this by proving a 
quantitative version of Theorem 4 which was given earlier. 

To see what is involved we consider two solutions to equation (8), $h_1$ and $h_2$,
driven by a common low mode process $\ell$ and noise $\eta$. That is, for $t > s$, $h_i(t) =
\Phi^\eta_{s,t}(\ell_{[s,t)}; h_i(s))$. Denoting $u_i = \ell + h_i$ and $\rho = h_1 - h_2$, we have

$$\frac{d\rho(t)}{dt} = -\nu\Lambda^2\rho(t) + \Pi_h B(u_1, \rho) + \Pi_h B(\rho, u_2)$$

which using standard estimates on the nonlinearity (cf. [CF88]) produces, for some
$C > 0$,

$$\frac{d|\rho|^2_{L^2}}{dt} \le -2\nu|\Lambda\rho|^2_{L^2} + (2C)^{1/2}|\Lambda\rho|_{L^2}|\Lambda u_2|_{L^2}|\rho|_{L^2} \le -\nu|\Lambda\rho|^2_{L^2} + \frac{C}{\nu}|\Lambda u_2|^2_{L^2}|\rho|^2_{L^2}.$$

Since $\rho$ lies in the high modes, $|\Lambda\rho|^2_{L^2} \ge N^2|\rho|^2_{L^2}$, and the above estimate then gives for $t_0 < t$

$$|\rho(t)|^2_{L^2} \le |\rho(t_0)|^2_{L^2} \exp\Big(-\nu N^2(t - t_0) + \frac{C}{\nu}\int_{t_0}^t |\Lambda u_2(s)|^2_{L^2}\, ds\Big). \tag{21}$$

We now see the new difficulty which the Stochastic Navier Stokes equations present
over the toy model. The contraction rate depends on the time average of the enstrophy
$|\Lambda u_2(s)|^2_{L^2}$ of one of the solutions. However, after we develop some estimates
controlling this quantity the proof will proceed using standard ideas of localization
from stochastic analysis.


10. The Energy and Enstrophy 

The toy model is an extremely uniform setting. The added difficulty in the
SNS, relative to the toy model, is the lack of uniformity. However, the standard
idea of localization from stochastic analysis allows us to overcome this hurdle. As
we saw in the last section, the growth of the energy $|u|^2_{L^2}$ and the time average of
the enstrophy $|\Lambda u|^2_{L^2}$ seem to be of importance in controlling the uniformity of the
contraction. This will become clearer after the next two sections. We begin with
some estimates on the energy and enstrophy.

Lemma 10.1. $\mathbb{E}|u(t)|^2_{L^2} \le e^{-2\nu(t-t_0)}\,\mathbb{E}|u(t_0)|^2_{L^2} + \frac{\mathcal{E}_0}{2\nu}\big(1 - e^{-2\nu(t-t_0)}\big)$ and for any $p >
1$, $\mathbb{E}|u(t)|^{2p}_{L^2} \le e^{-2\nu p(t-t_0)}\,\mathbb{E}|u(t_0)|^{2p}_{L^2} + C_0\int_{t_0}^t e^{-2\nu p(t-s)}\,\mathbb{E}|u(s)|^{2(p-1)}_{L^2}\, ds$.

This implies that if one has a solution $u(t, W)$ defined for $t \in (-\infty,\infty)$ such that
$e^{-2\nu|t|}\,\mathbb{E}|u(t, W)|^2_{L^2} \to 0$ as $t \to -\infty$ then in fact $\mathbb{E}|u(t, W)|^2_{L^2}$ is uniformly bounded
in time. Using similar reasoning, one can show the following result.

Lemma 10.2. Assume that $\mu$ is an invariant measure such that there exists a
$U \subset L^2$ with $\mu(U) = 1$. For any such stationary measure all energy
moments are finite. In fact for any $p \ge 1$ there exists a constant $C_p < \infty$ such
that $\int_{L^2} |u|^{2p}_{L^2}\, d\mu(u) \le C_p$ for all invariant measures $\mu$. In particular, $C_1 = \frac{\mathcal{E}_0}{2\nu}$.
Furthermore, $\int_{L^2} |\Lambda u|^2_{L^2}\, d\mu(u) = \frac{\mathcal{E}_0}{2\nu}$ assuming only that $\mathcal{E}_0 = \sum|\sigma_k|^2 < \infty$. If
$\mathcal{E}_1 = \sum|k|^2|\sigma_k|^2 < \infty$ then the analogous statements hold for $|\Lambda u|^2_{L^2}$ replacing
$|u|^2_{L^2}$. In particular, $\int_{L^2} |\Lambda^2 u|^2_{L^2}\, d\mu(u) = \frac{\mathcal{E}_1}{2\nu}$.

Since one can construct a stationary solution from any invariant measure and vice
versa (see section 8.2), this conclusion applies equally to any stationary solution.
The proofs of Lemma 10.1 and 10.2 can be found in the appendix of [EMS01].
Related statements can be found in Chapter 3 and 4 of [Mat98], section 2 of [Mat99],
or the appendix of [Mat02b]. The moment estimates are just the stochastic analogs
of deterministic estimates. Similar estimates from slightly different points of view
can be found in [VF88, MR, BKL00]. If one assumes $\mathcal{E}_1 = \sum|\sigma_k|^2|k|^2 < \infty$ then
completely analogous statements can be made about the enstrophy.

It is critical to our analysis to understand the typical size of the fluctuations
of the enstrophy about its mean of $\frac{\mathcal{E}_0}{2\nu}$. Applying Itô's formula to the energy, one
obtains

$$d|u(t)|^2_{L^2} = -2\nu|\Lambda u(t)|^2_{L^2}\, dt + \mathcal{E}_0\, dt + 2\langle u, dW(t)\rangle_{L^2}.$$

If one writes the last term as

$$2\Big(\sum_k |\sigma_k|^2|u_k|^2\Big)^{1/2}\left[\frac{\langle u, dW(t)\rangle_{L^2}}{\big(\sum_k |\sigma_k|^2|u_k|^2\big)^{1/2}}\right]$$

then the term in the square brackets is distributed as a one dimensional Brownian
motion adapted to the filtration generated by the $W$ increments. This motivates our
definition of a Lyapunov function in the abstract setting (10), which is contained in
the next section.


11. Growth and Fluctuations in A General Setting 

In this section, we put an abstract framework on the ideas of the previous section.
In Section 12, we return to the concrete setting of the SNS.

Assumption 4. There exists a function $V : X \to [0,\infty)$, with $c_0 V(x)^{p_0} \ge |x|^2_X$ for
positive $p_0$, $c_0$, so that for a solution $u$ of equation (10) $V$ satisfies the Itô equation

$$dV(u(t)) = g(u(t))\,dt + f(u(t))\,dB(t).$$

Here $B$ is a standard one dimensional Wiener process adapted to the flow generated
by $(dW)$. $g : X \to \mathbb{R} \cup \infty$ is a function satisfying

$$g(u) \le C_1 - U(u) \quad\text{where}\quad U(u) \ge C_2 V(u)$$

for some constants $C_1, C_2 > 0$ and $U : X \to [0,\infty) \cup \infty$. Though $U$ and $g$ might be
infinite on $X$, we assume that if $u(t)$ is a solution to (10) on $[0,t]$, then $\int_0^t U(u(s))\,ds < \infty$
almost surely and the above inequalities hold whenever $U(u) < \infty$. And $f : X \to
\mathbb{R} \cup \infty$ is a function satisfying

$$C_3|f(u)|^2 \le U(u)$$

for some $C_3 > 0$.

From the calculation at the end of the last section, in the SNS setting we should
take $V(u) = |u|^2_{L^2}$, $U(u) = 2\nu|\Lambda u|^2_{L^2}$, $C_1 = \mathcal{E}_0$, $C_2 = 2\nu$, and $C_3 = \frac{\nu}{2\sup_k|\sigma_k|^2}$. (We could
have also used $V(u) = |\Lambda u|^2_{L^2}$, $U(u) = 2\nu|\Lambda^2 u|^2_{L^2}$ if $\mathcal{E}_1 = \sum|\sigma_k|^2|k|^2 < \infty$.)

Lemma 11.1. For any $\epsilon \in (0,1)$ and $K > \frac{1}{\epsilon C_3}$ one has

$$\mathbb{P}\left\{\sup_{t \ge 0} \frac{V(u(t)) + (1-\epsilon)\int_0^t U(u(s))\,ds - V(u_0) - C_1 t}{1 + \log(1+t)} > K\right\} \le \exp(-2C_3\epsilon K).$$

Proof: Let $M(t)$ denote the martingale $\int_0^t f(u(s))\,dB(s)$. Its quadratic variation
$[M,M](t)$ is $\int_0^t f(u(s))^2\,ds$. Since $C_3[M,M](t) \le \int_0^t U(u(s))\,ds$, by Itô's formula we
have

$$V(u(t)) + (1-\epsilon)\int_0^t U(u(s))\,ds - V(u_0) - C_1 t \le M(t) - \epsilon C_3[M,M](t).$$

The exponential martingale estimate implies that

$$\mathbb{P}\Big\{\sup_t\, M(t) - \epsilon C_3[M,M](t) \ge \alpha\Big\} \le e^{-2\epsilon C_3\alpha}. \tag{22}$$

Setting $\alpha = K[1 + \log(T)]$ one sees that the probability of the event in the statement of
the lemma is bounded from above by

$$\sum_{n=1}^\infty \exp\big(-2C_3\epsilon K[1 + \log(n)]\big) \le \int_1^\infty \exp\big(-2\epsilon C_3 K - 2\log(x)\big)\,dx = \exp(-2C_3\epsilon K).$$

More details can be found in the proofs of the following related results: Theorem
4-6 in [BM03], Lemma A.5 and Lemma B.3 of [EMS01], or for the use of the
exponential martingale Lemma A.2 [Mat02c]. □

Using similar reasoning one can prove (see for example [BKL00, Mat02b]):

Lemma 11.2. There exist positive constants $\gamma$ and $K$ so that for all invariant
measures $\mu$ with $V(u) < \infty$ $\mu$-almost surely, $\int \exp(\gamma V(u))\,d\mu(u) \le K < \infty$ and for
every initial condition $u_0$ and $t \ge 0$, $\mathbb{E}\exp(\gamma V(u(t))) \le K\exp(\gamma V(u_0))$.

We now give estimates backward in time for stationary solutions.

Lemma 11.3. Let $u(t, W)$ be a stationary solution to (10) with $V(u(t)) < \infty$ almost
surely. There exists a $K_0$ and a $\gamma > 0$ so that for $K > K_0$

$$\mathbb{P}\left\{\sup_{t \le 0} \frac{V(u(t)) + (1-\epsilon)\int_t^0 U(u(s))\,ds - C_1|t|}{1 + \log(1+|t|)} > K\right\} \le \exp(-\gamma K).$$

PROOF: The proof is essentially the same as that of Lemma 11.1. We write



sup V(u(t)) + (1 - e 

te[T-i,T] 



sup 

te[T-i,T] 



U(u(t))ds 
V(u(t)) + (l-e) 



d\t\ < V(u(T- 1))+ 



U(u(s))ds 



-Ci|t|-y(«(T-l))

By the previous two lemmas both of these terms have exponential moments uniform
in $T$. Using the same reasoning as in the end of the proof of Lemma 11.1 completes
the proof. □

In light of Lemmas 11.1 and 11.3 we define the following sets of "nice" trajectories
which average well and grow in a typical fashion. Fixing some $\epsilon_* \in (0,1)$, which
will be set differently in different contexts, we define

$$A_n = \left\{u \in C(-\infty,0; X) : \sup_{t \le 0} \frac{V(u(t)) + (1-\epsilon_*)\int_t^0 U(u(s))\,ds - C_1|t|}{1 + \log(1+|t|)} \le n\right\},$$
$$B_n = \left\{u \in C(0,\infty; X) : \sup_{t \ge 0} \frac{V(u(t)) + (1-\epsilon_*)\int_0^t U(u(s))\,ds - V(u(0)) - C_1 t}{1 + \log(1+t)} \le n\right\}. \tag{23}$$

The previous lemmas imply that, with probability one, any stationary solution is
contained in $\bigcup A_n$ and the solution to any initial value problem is contained in $\bigcup B_n$.
From these lemmas it is clear that

$$\lim_{t\to\infty} \frac1t\int_0^t U(u(s))\,ds \le C_1 \quad\text{and}\quad \lim_{t\to-\infty} \frac{1}{|t|}\int_t^0 U(u(s))\,ds \le C_1$$

almost surely. This is the result analogous to the final conclusion of Lemma 10.2.
But notice that Lemma 11.1 and 11.3 also give information about the size of the
fluctuations.
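To see the time-average bound in the simplest possible case, one can replace the SNS by a scalar Ornstein-Uhlenbeck process $dx = -\nu x\,dt + \sigma\,dB$. This is a stand-in chosen here for illustration, not a model from the text. With $V(x) = x^2$, Itô's formula gives $dV = (\sigma^2 - 2\nu x^2)\,dt + 2\sigma x\,dB$, so in the notation of Assumption 4 one may take $U(x) = 2\nu x^2$, $C_1 = \sigma^2$, $C_2 = 2\nu$ and $C_3 = \nu/(2\sigma^2)$. The sketch below estimates $\frac1t\int_0^t U(x(s))\,ds$ along one Euler-Maruyama path; the step size, horizon and seed are arbitrary assumptions.

```python
import math
import random


def time_average_U(nu=1.0, sigma=1.0, T=1000.0, dt=0.01, x0=0.0, seed=1):
    """Estimate (1/T) * int_0^T U(x(s)) ds with U(x) = 2*nu*x^2 along one
    Euler-Maruyama path of dx = -nu*x dt + sigma dB."""
    rng = random.Random(seed)
    n_steps = int(round(T / dt))
    root_dt = math.sqrt(dt)
    x = x0
    integral = 0.0
    for _ in range(n_steps):
        integral += 2.0 * nu * x * x * dt            # left-endpoint rule for the integral of U
        x += -nu * x * dt + sigma * rng.gauss(0.0, root_dt)
    return integral / T


if __name__ == "__main__":
    sigma = 1.0
    avg = time_average_U(sigma=sigma)
    print(f"time average of U: {avg:.3f}   C_1 = sigma^2 = {sigma ** 2:.3f}")
```

For this linear example the long-time average converges to $C_1$ itself, so a single long path should give a value close to $\sigma^2$, up to sampling and discretization error.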



12. Strongly Contractive Case SNS: Proof of Theorem 5 

In the next section, we will take a more abstract point of view on the contractive
nature of the SNS equation and other SPDEs. However, first for illustrative reasons,
we continue with the explicit calculations begun in section 9 and use them to prove
Theorem 5.

PROOF OF THEOREM 5: We begin by proving uniqueness. Let us assume that
there are two solutions $u(t, W)$ and $u^*(t, W)$ defined for all $t \in (-\infty,\infty)$. Let
$\rho(t, W) = u(t, W) - u^*(t, W)$. Both are governed by (3), so subtracting produces

$$\frac{\partial\rho(t)}{\partial t} = -\nu\Lambda^2\rho(t) + B(u, \rho) + B(\rho, u^*).$$

Using the same estimates on the nonlinearity as in (21), we obtain

$$\frac{d|\rho|^2_{L^2}}{dt} \le \Big(-\nu N^2 + \frac{C}{\nu}|\Lambda u^*|^2_{L^2}\Big)|\rho|^2_{L^2}.$$

Notice that this is the same estimate as was obtained in (21). Since $u^*$ is a solution,
we know that it is contained in some $A_n$, where $A_n$ was the set of "nicely" fluctuating
and growing paths defined in the last section. Recall that for the SNS: $V(u) = |u|^2_{L^2}$,
$U(u) = 2\nu|\Lambda u|^2_{L^2}$, $C_1 = \mathcal{E}_0$. Hence, $u \in A_n$ implies that for $t \le 0$

$$|u(t)|^2_{L^2} + (1-\epsilon_*)\,2\nu\int_t^0 |\Lambda u(s)|^2_{L^2}\,ds - \mathcal{E}_0|t| \le n[1 + \log(|t| + 1)]$$

where $\epsilon_* \in (0,1)$ is a free parameter which we will set momentarily. Continuing the
estimation of $\rho$, using this bound, produces for $t_0 < 0$

$$|\rho(0)|^2_{L^2} \le |\rho(t_0)|^2_{L^2}\exp\Big(-\nu N^2|t_0| + \frac{C}{\nu}\int_{t_0}^0 |\Lambda u^*(s)|^2_{L^2}\,ds\Big)$$
$$\le |\rho(t_0)|^2_{L^2}\exp\Big(-\Big[\nu N^2 - \frac{C\mathcal{E}_0}{\nu^2(1-\epsilon_*)}\Big]|t_0| + \frac{C\,n[1 + \log(|t_0| + 1)]}{\nu^2(1-\epsilon_*)}\Big).$$

Picking $\epsilon_*$ so $\nu N^2 - \frac{C\mathcal{E}_0}{\nu^2(1-\epsilon_*)} = \frac12\gamma_*$ where $\gamma_* = \nu N^2 - \frac{C\mathcal{E}_0}{\nu^2}$ and using the assumption
that the solutions are in $A_n$ to control $|\rho(t_0)|^2_{L^2}$ yields the estimate

$$|\rho(0)|^2_{L^2} \le 4\big[\mathcal{E}_0|t_0| + n(1 + \log(1 + |t_0|))\big]\exp\Big(-\frac12\gamma_*|t_0| + \frac{C\,n[1 + \log(1 + |t_0|)]}{\nu^2(1-\epsilon_*)}\Big).$$

Taking $t_0 \to -\infty$ proves uniqueness. A similar estimate shows that the solution
to any initial value problem converges exponentially forward in time to $u^*$. The
existence can be deduced from the existence of a stationary measure; however, it is
instructive to construct it directly, which we now do.
instructive to construct it directly, which we now do. 

Let $u_n(t)$ be the solution starting from initial value zero at time $-n$. From
Lemma 11.1, we know that $\theta_{-n}u_n \in B_{k_n}$ for some $k_n$. ($\theta_{-n}$ just shifts the path on
[—71, 00) to a path on [0, 00).) In addition, we know that F{k n >n«} < exp(— ^n«). 

Hence, by the Borel-Cantelli Lemma, there exists an n* so that k n < n§ for all 
n > 7i*. Let n > m > M > 71*, then for M sufficiently large we have 



sup \u n (s) - u m (s)\ L2 < V] SU P \ u j+i( s ) ~ u j( s ) 
»e[-i,o] j=m+1 se[-i,o] 



II 2 



00 



< 2_. su p 

j=Af»e[-i,o] 



00 j 

< \ u j+iH)\v e M-^i*\j - 1| + b'l* [1 + iog(i + |j|)]) 

1 

< E P° + 2 ^'l"] ex p(-o7*b - 1| + b"l*[i + log(l + 



j=M 



j=M 



Since the last sum is less than $C\exp(-\gamma M)$ for some positive constants $C$ and $\gamma$,
the sequence is Cauchy and the proof is complete. □



13. Contractive Nature in the General Setting 

We now extract the essential assumptions of the previous section and present 
them in an abstract form. The choice of assumptions follows [BM03] which uses ever 
so slightly different assumptions, but proves more detailed estimates. In particular, 
statements about the continuity of the map $\Phi$ are made. (See Theorem 9 of [BM03].)
The treatment is also informed and influenced by [Hai02, Mat02c, EL02].

Assumption 5. Consider the $G : X \to X$ from (10). Assume that Assumption 4
holds and that for all $\ell \in X_\ell$ and $h, \tilde h \in X_h$, with $\ell + h, \ell + \tilde h \in \mathcal{D}(G)$, and some
$c_i > 0$ and $p_i > 0$, with $c_1 > C_1(c_2 + c_3)$,

$$2\big\langle G(\ell + h) - G(\ell + \tilde h),\ h - \tilde h\big\rangle_X \le \big[-c_1 + c_2 U(\ell + h) + c_3 U(\ell + \tilde h)\big]\,|h - \tilde h|^2_X$$
$$\big|\Pi_\ell G(\ell + h) - \Pi_\ell G(\ell + \tilde h)\big|_X \le c_4\big[1 + V(\ell + h)^{p_1} + V(\ell + \tilde h)^{p_1}\big]\,|h - \tilde h|^{p_2}_X.$$

We give the analog of Theorem 4 and Lemma 8.1 in the general setting of equation
(10). This is a quantitative version of the determining mode result given in
Theorem 4 and will be used to verify Assumption 1.

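Lemma 13.1. Let Assumption 5 hold. In particular, $\gamma_* = c_1 - C_1(c_2 + c_3) > 0$. Set
the $\epsilon_*$ from the definition of $A_n$ and $B_n$ in (23), so that $c_1 - \frac{C_1(c_2 + c_3)}{1 - \epsilon_*} = \frac12\gamma_*$ and
define $c_* = \frac{c_2 + c_3}{1 - \epsilon_*}$.

1. Fixing a $T \in (0,\infty]$, let $\ell \in C(0,T; X_\ell)$ and $h_0, \tilde h_0 \in X_h$ and define $u(t) =
\ell(t) + \Phi^\eta_{0,t}(\ell_{[0,t]}; h_0)$ and $\tilde u(t) = \ell(t) + \Phi^\eta_{0,t}(\ell_{[0,t]}; \tilde h_0)$.

Assume that $u \in \Pi_{[0,T)}B_n$ if $c_2 > 0$ and $\tilde u \in \Pi_{[0,T)}B_n$ if $c_3 > 0$. Then for all
$t \in [0,T)$

$$\big|\Phi^\eta_{0,t}(\ell_{[0,t]}; h_0) - \Phi^\eta_{0,t}(\ell_{[0,t]}; \tilde h_0)\big|^2_X \le |h_0 - \tilde h_0|^2_X\, e^{nc_*}(1 + t)^{nc_*}e^{-\frac12\gamma_* t}.$$

2. Fixing a $T \in (-\infty,0)$, let $\ell \in C(T,0; X_\ell)$ and $h_0, \tilde h_0 \in X_h$ and define $u(t) =
\ell(t) + \Phi^\eta_{T,t}(\ell_{[T,t]}; h_0)$ and $\tilde u(t) = \ell(t) + \Phi^\eta_{T,t}(\ell_{[T,t]}; \tilde h_0)$.

Assume that $u \in \Pi_{[T,0]}A_n$ if $c_2 > 0$ and $\tilde u \in \Pi_{[T,0]}A_n$ if $c_3 > 0$. Then for all
$t \in [T,0]$

$$\big|\Phi^\eta_{T,t}(\ell_{[T,t]}; h_0) - \Phi^\eta_{T,t}(\ell_{[T,t]}; \tilde h_0)\big|^2_X \le |h_0 - \tilde h_0|^2_X\, e^{nc_*}(1 + |T|)^{nc_*}e^{-\frac12\gamma_*(|T| - |t|)}.$$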

This lemma can be restated in terms of solutions to (10).

Corollary 13.2. Let $\Omega_0 \subset \Omega \times \Omega$ and let $c_*$ and $\gamma_*$ be as defined in Lemma 13.1.

1. Let $u(t, W)$ and $u(t, \tilde W)$ be solutions to (10) on $C(0,T; X)$ with $T > 0$. If
for $(W, \tilde W) \in \Omega_0$, one has $u(W), u(\tilde W) \in \Pi_{[0,T]}B_n$ and $\Pi_h W(s) - \Pi_h W(T) =
\Pi_h\tilde W(s) - \Pi_h\tilde W(T)$, $\Pi_\ell u(s, W) = \Pi_\ell u(s, \tilde W)$ for $s \in [0,T]$ then

$$\big|u(t, W) - u(t, \tilde W)\big|^2_X \le \big|u(0, W) - u(0, \tilde W)\big|^2_X\, e^{nc_*}(1 + t)^{nc_*}e^{-\frac12\gamma_* t}.$$

2. Let $u(t, W)$ and $u(t, \tilde W)$ be solutions to (10) on $C(T,0; X)$ with $T < 0$. If
for $(W, \tilde W) \in \Omega_0$, one has $u(W), u(\tilde W) \in \Pi_{[T,0]}A_n$ and $\Pi_h W(s) - \Pi_h W(T) =
\Pi_h\tilde W(s) - \Pi_h\tilde W(T)$, $\Pi_\ell u(s, W) = \Pi_\ell u(s, \tilde W)$ for $s \in [T,0]$ then

$$\big|u(t, W) - u(t, \tilde W)\big|^2_X \le 2\big[C_1|T| + n + n\log(1 + |T|)\big]\, e^{nc_*}(1 + |T|)^{nc_*}e^{-\frac12\gamma_*(|T| - |t|)}.$$

Proof of Lemma 13.1 and Corollary 13.2: The proofs of the two statements
are almost identical and are simply an abstraction of the ideas in the proof of Theorem
5 given in the last section. We give the details of the first statement.

Let $\rho(s) = \Phi^\eta_{0,s}(\ell_{[0,s]}; h_0) - \Phi^\eta_{0,s}(\ell_{[0,s]}; \tilde h_0)$, then equation (11) and the assumption
in the lemma and Assumption 5 imply that

$$\frac{d|\rho(s)|^2_X}{ds} \le \big[-c_1 + c_2 U(u(s)) + c_3 U(\tilde u(s))\big]\,|\rho(s)|^2_X.$$

Since $u, \tilde u \in \Pi_{[0,T]}B_n$, we have

$$|\rho(t)|^2_X \le |\rho(0)|^2_X\exp\Big(-\Big[c_1 - \frac{C_1(c_2 + c_3)}{1 - \epsilon_*}\Big]t + n\,\frac{c_2 + c_3}{1 - \epsilon_*}\big(1 + \log(1 + t)\big)\Big)$$

which proves the first result. The second result is just the same except that the
estimates from $A_n$ are used. See the proof of Theorem 5. Corollary 13.2 is just a
restatement of the lemma with the added observation that backwards in time the
initial conditions $h_0 = \Pi_h u(T, W)$ and $\tilde h_0 = \Pi_h u(T, \tilde W)$ can not grow too fast since
the solutions are in $A_n$. □

Using the contractive properties backward in time one can define the limit

$$\lim_{T\to-\infty} \Phi^\eta_{T,0}(\ell_{[T,0]}; h_0)$$

for any $\ell$ which is a projection of a solution $u(t, W)$ on $(-\infty,0]$. This limiting
function, denoted $\Phi(\ell)$, is independent of $h_0$ and can be used to reduce the dynamics
to one on $X_\ell$ with memory (i.e. Gibbsian dynamics). See [BM03] for a discussion of
this in a general setting and [EMS01, EL02] for specific examples. If one endows 
C(— oo,0; X) with the metric \u\ r = sup f<0 j^w , then in many settings $(£) is 
continuous on the set of solutions u with £ = ILw. In fact under some simple 
assumptions, it is globally Lipschitz on each B n defined in section 11. In particular, 
both of these facts hold for the SNS equation. See [BM03] for more discussion of 
this. 



14. Ergodicity: the SNS and the General Setting 

We now turn to completing the proof of Theorem 3. All that remains to prove is 
the last part of Theorem 3 about the "essentially elliptic" dynamics (the case when 

> Cf§- ). We will do so by proving an ergodic theorem in the general setting of 
(10) and using the assumptions already introduced. 

To prove basic ergodicity in this case, we will use Theorem 6, which along with 
Corollary 13.2, contains the essential ideas from [EMS01]. Lemma 11.1 implies 
that almost every solution V 9 |o'oo)' u o is contained in a B n for some n. This, coupled 
with Lemma 13.2, is more than enough to imply Assumption 1 of Section 8.2 with 
$B = \bigcup_n B_n$. We need only verify Assumption 2 to complete the proof. Since
the author feels that techniques often used to verify the first part of Assumption 2 
are suboptimal, we leave it as an assumption for the moment. We will revisit the 
question at the end of this section. Hence, we introduce the following assumption. 
Assumption 6. For all $t > 0$ and $(\ell_0, h_0) \in X$ and almost every $\eta$, $Q^\eta_t(\ell_0, h_0, \cdot\,)$ is
equivalent to Lebesgue measure.



The idea used to prove the second part of Assumption 2 is again localization.
By restricting ourselves to well behaved paths, we will be able to obtain
the needed result for a subset of the probability space. By relaxing the restriction,
we can include arbitrarily large subsets of the probability space, implying that the
conclusion holds with probability one. We prove the following result.

Theorem 7. Consider equation (10). Let Assumptions 4, 5 and 6 hold. If, in addition,
$\sigma_k > 0$ for all $k$ with $|k| \in (0, N_*)$, where $N_*$ was used to define the splitting of
equation (10), then the system has at most one invariant measure.

Note: It is worth mentioning that existence of an invariant measure in our setting
is usually straightforward. For instance, if the set $\{u : V(u) < M\}$ is precompact for
all M then the result follows easily by the standard Krylov-Bogoljubov construction 
of extracting a convergent subsequence from the empirical measures obtained by 
time-averaging. See for instance [CK97] for the SPDE setting or [CFS82] for general 
discussions. 
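
As a rough numerical illustration of the time-averaging construction mentioned in the note, the following Python sketch averages a test function along one path of a scalar dissipative recursion. The model $x_{n+1} = a x_n + \sigma\xi_n$ with Gaussian $\xi_n$ is an assumption chosen purely for illustration (it is not equation (10)), and all function and parameter names are hypothetical. With $V(x) = x^2$ the sublevel sets are compact, and the time-averaged second moment should settle near the invariant value $\sigma^2/(1-a^2)$ regardless of the starting point.

```python
import random


def time_averaged_second_moment(x0, a=0.5, sigma=1.0, steps=200_000, seed=0):
    """Time average of x^2 along one path of x_{n+1} = a*x_n + sigma*N(0,1).

    This is the empirical-measure average used in the Krylov-Bogoljubov
    construction, applied to the test function f(x) = x^2.
    """
    rng = random.Random(seed)
    x = x0
    total = 0.0
    for _ in range(steps):
        x = a * x + sigma * rng.gauss(0.0, 1.0)
        total += x * x
    return total / steps


def invariant_second_moment(a=0.5, sigma=1.0):
    """Second moment of the invariant Gaussian measure of the toy recursion."""
    return sigma ** 2 / (1.0 - a ** 2)


if __name__ == "__main__":
    for start in (0.0, 10.0):
        print(start, time_averaged_second_moment(start), invariant_second_moment())
```

The two starting points give nearly the same average, which is the uniqueness statement in miniature; the toy model is of course far simpler than the SPDE setting.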

We begin the proof of Theorem 7 by proving the analog of Lemma 8.2 from the 
discussion of the toy model. In fact, we will only deduce part of it from our existing 
assumptions leaving the remainder still as an assumption. 

Lemma 14.1. Consider the solution to equation (10) under the assumptions of
Theorem 7. For $\ell_0 \in X_\ell$ and $h_0, \tilde h_0 \in X_h$ and almost every $\eta$, $Q^\eta_{[0,t]}(\ell_0, h_0, \cdot\,)$ is
equivalent to $Q^\eta_{[0,t]}(\ell_0, \tilde h_0, \cdot\,)$.

Proof of Lemma 14.1: Again we begin by essentially localizing to a fixed $B_n$.
However, we need to pick a set of paths in $C(0, \infty; X_\ell)$. Fixing $\ell_0$, $h_0$, $\tilde h_0$ and $\eta$, we
define



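$$B'_n = \Big\{ \ell_{[0,\infty)} : u, \tilde u \in B_n \text{ where } u(s) = \ell(s) + \Phi^\eta_{0,s}(\ell_{[0,s]}, h_0),\ \tilde u(s) = \ell(s) + \Phi^\eta_{0,s}(\ell_{[0,s]}, \tilde h_0) \Big\}.$$

Then for $A \subset C(0, \infty; X_\ell)$, define

$$Q^\eta_{[0,t]}(\ell_0, h_0, A; B'_n) = \mathbb{P}\big\{ \Pi_\ell u_{[0,t]} \in A;\ \Pi_\ell u_{[0,t]} \in B'_n \,\big|\, u(0) = (\ell_0, h_0) \big\}. \qquad (24)$$

We now compare $Q^\eta_{[0,t]}(\ell_0, h_0, \cdot\,; B'_n)$ to $Q^\eta_{[0,t]}(\ell_0, \tilde h_0, \cdot\,; B'_n)$. Again we compare the
measures using Lemma A.1 from the appendix. By restricting to $B'_n$, we ensure that
both $u$ and $\tilde u$ stay in $B_n$. Hence, the first part of Lemma 13.1 combined with the
second estimate in Assumption 5 produces

$$|\Pi_\ell G(u(t)) - \Pi_\ell G(\tilde u(t))|_X^2 \le c_4\big[1 + 2(C_1 t + n[1 + \log(1+t)])^{p_1}\big] \times \big|\Phi^\eta_{0,t}(\Pi_\ell u_{[0,t]}; h_0) - \Phi^\eta_{0,t}(\Pi_\ell u_{[0,t]}; \tilde h_0)\big|_X^2$$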
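where

$$\big|\Phi^\eta_{0,t}(\Pi_\ell u_{[0,t]}; h_0) - \Phi^\eta_{0,t}(\Pi_\ell u_{[0,t]}; \tilde h_0)\big|_X^2 \le |h_0 - \tilde h_0|_X^2\, e^{nc_*}(1+t)^{nc_*} e^{-\frac12\gamma_* t}. \qquad (25)$$

Defining $\sigma_{\min}^2 = \min_{|k| < N_*} |\sigma_k|^2$, the previous two estimates imply that

$$\int_0^\infty \frac{1}{\sigma_{\min}^2}\,|\Pi_\ell G(u(t)) - \Pi_\ell G(\tilde u(t))|_X^2\, dt < D_* < \infty \qquad (26)$$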



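for some $D_*$ uniformly on $B'_n$. Using Lemma A.1, we conclude that $Q^\eta_{[0,t]}(\ell_0, h_0, \cdot\,; B'_n)$
is equivalent to $Q^\eta_{[0,t]}(\ell_0, \tilde h_0, \cdot\,; B'_n)$. As in the previous part, since both $u$ and $\tilde u$
are in $\bigcup_n B_n$ with probability one, we conclude that $Q^\eta_{[0,t]}(\ell_0, h_0, \cdot\,)$ is equivalent to
$Q^\eta_{[0,t]}(\ell_0, \tilde h_0, \cdot\,)$.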

Looking back on the above proof, we see that there was a great deal of uniformity
in the estimates. When comparing $Q^\eta_{[0,t]}(\ell_0, h_0, \cdot\,; B'_n)$ to $Q^\eta_{[0,t]}(\ell_0, \tilde h_0, \cdot\,; B'_n)$,
we see that for all $(\ell_0, h_0)$, $(\ell_0, \tilde h_0)$ in a bounded ball, we can choose the same $D_*$.
From Lemma A.1 in the appendix, we get the following result



Lemma 14.2. For any M , there exists a so that if \£ + h \ 
then 



% + ho 



< M 



E 



\B' n ) 



alQl t] {£o,ho, ■ ;B' n ) 



< £>p(p-i) 



for all p > 0. 



Conclusion of the Proof of Theorem 7: In light of Lemma 14.1 and 13.1 
the result follows from Theorem 6. □ 

We now address Assumption 6. In the case of the stochastic Navier Stokes
equations it is implied without further assumptions by the techniques used to prove
Theorem 10 from section 17 since all of the directions in $X_\ell$ are directly forced.
(Theorem 10 does not address the question of the marginal with respect to $\eta$. However
Theorem 10 follows from the fact that the Malliavin covariance matrix restricted
to $X_\ell$ is almost surely invertible. This does imply the result for the marginals. See
[MP03].)

The same techniques should apply to most SPDEs of interest with additive noise.
However, since an abstract version of the techniques in [MP03] is not written, we
refrain from making any claims. There is however another approach. Though it is
rather ad hoc and, in the author's opinion, "not the correct way," it is sufficient in
many contexts. The basic idea is to compare the measures induced on $C(0, t; X_\ell)$ by
the process of interest and some well understood process both starting from the same
point. This is done using Girsanov's theorem. Then the time $t$ marginals of the well
understood process starting from two different points are compared. By stringing 
the estimates together and making some additional assumptions needed to control 
the "high" modes in equation (8), one can prove Assumption 6. A particularly 
simple version of this was done in the toy model of the previous section. For more 
complicated versions see [EMS01, EL02, Mat02c, BM03]. [BM03] has a relatively 
crisp version of the argument. 
15. Exponential Mixing and Coupling

In this section, we expand the simple uniqueness results, given earlier in the
paper, by giving a rate of convergence. The proof will be based on a coupling
argument and is closer in packaging to the author's first proof of basic ergodicity,
which was presented in seminar talks^1. We will measure the rate of convergence of
(10) using the following metric. For any two measures $\mu_1$ and $\mu_2$ on $X$ define

$$\|\mu_1 - \mu_2\|_* = \sup_{\phi \in G_*} \int \phi(x)\,\mu_1(dx) - \int \phi(x)\,\mu_2(dx)$$

where $G_*$ is the set of all measurable functions $\phi : X \to \mathbb{R}$ with $|\phi(u)| \le 1$ for all
$u \in X$ and $|\phi(\ell + h) - \phi(\ell + \tilde h)| \le |h - \tilde h|_X$ for all $\ell \in X_\ell$ and $h, \tilde h \in X_h$. Notice that
the || • ||* norm dominates the Wasserstein or Kantorovich distance for measures 
but is weaker than the total variation norm. In the definition of G*, we could have 
also used test function which were Lip a on X^, with a > 0, and all of the theorems 
below would still hold. 
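
To make the comparison with total variation concrete, here is a small Python sketch for the simplest possible case: two point masses in a two-dimensional phase space with one low mode $\ell$ and one high mode $h$. This finite-dimensional setting and the function names are assumptions made for illustration only. Since the test functions are merely measurable in $\ell$ but 1-Lipschitz in $h$, the distance behaves like total variation across different low-mode values and like a bounded Lipschitz distance along the high mode.

```python
def star_distance_point_masses(u, v):
    """||delta_u - delta_v||_* for point masses at u = (l, h) and v = (l', h').

    Test functions satisfy |phi| <= 1 everywhere and are 1-Lipschitz in the
    high-mode coordinate h only; no regularity is imposed in the low-mode
    coordinate l.
    """
    (l1, h1), (l2, h2) = u, v
    if l1 != l2:
        # phi may jump between different low-mode fibres, so the supremum is
        # attained by phi = +1 on one fibre and phi = -1 on the other.
        return 2.0
    # Same fibre: phi(l, h) = clip(h - (h1 + h2) / 2, -1, 1) attains the supremum.
    return min(2.0, abs(h1 - h2))


def total_variation_point_masses(u, v):
    """||delta_u - delta_v||_TV, normalised so that distinct point masses give 2."""
    return 0.0 if u == v else 2.0
```

For example, two point masses on the same low-mode fibre at high-mode distance 0.5 are at distance 0.5 in this metric but at total variation distance 2, which is the sense in which the norm is weaker than total variation while still detecting any difference in the low modes.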

Now we make the following assumption which is a more qualitative version of
Assumption 6. It amounts to continuity in the initial condition of the density induced
on $X_\ell$ at time $t$.

Assumption 7. Fix any $t > 0$. For any $M_0$, there exist a positive $\delta$ and $\Omega' \subset \Pi_h\Omega$
so $\mathbb{P}(\eta \in \Omega') > \delta$ and for any $\eta \in \Omega'$ and $u_0^{(i)} \in X$, $i = 1, 2$, with $V(u_0^{(i)}) < M_0$ we
have $\frac12\|Q^\eta_t(u_0^{(1)}, \cdot\,) - Q^\eta_t(u_0^{(2)}, \cdot\,)\|_{TV} < 1 - \delta$.

See Appendix B for the definition of $\|\cdot\|_{TV}$, which may differ by a factor of 2
from some definitions. Again this estimate can be obtained in a number of ways. For
the SNS it was obtained by comparing, in a quantitative fashion, the total variation
distance between the time $t$ marginals and a well controlled reference process (either
Brownian motion or the SDE on $X_\ell$ obtained from the Galerkin truncation of the
SNS). However the author feels that this is not the optimal fashion to proceed. It
would be better to use the flow property and the calculations from [MP03] to verify
this estimate. Since the assumption has only been verified in specific cases, we leave
it as an assumption.

Letting $P_t(u_0, A) = \mathbb{P}\{u(t) \in A \mid u(0) = u_0\}$ where $A \subset X$, we have the following
result, whose proof is given in the sections which follow. Stronger results using norms
allowing test functions which grow are also possible by the methods presented here.
Corollary 15.1 at the end of the section gives a simple, suboptimal example. See
[MT93] or [MSH02] for examples of the type of stronger statements which should
be possible. However [MT93, MSH02] do not apply to our setting.

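Theorem 8. If Assumptions 4, 5 and 7 hold, then there exist fixed positive constants
$K$ and $\gamma$ so that for all $u_0^{(1)}, u_0^{(2)} \in X$ (possibly random, but adapted to the filtration at time
zero; see notes below)

$$\|P_t(u_0^{(1)}, \cdot\,) - P_t(u_0^{(2)}, \cdot\,)\|_* \le K\big[1 + \mathbb{E}V(u_0^{(1)}) + \mathbb{E}V(u_0^{(2)})\big]\, e^{-\gamma t}.$$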

1 Stanford and Berkeley probability seminars November and December 1999. 
We give the proof of this theorem in the next sections. In [Mat02c] a general
theorem, ensuring exponential mixing in a wide class of problems, was given and the
conditions were verified for the SNS. However, given the estimates of the previous
section the exact same analysis applies to equation (10) when $\eta = \Pi_h W = 0$. In that
paper, the case $\eta = \Pi_h W \neq 0$ was discussed in another setting. A straightforward
modification of the techniques from that paper yields the extension to $\eta = \Pi_h W \neq 0$.
Kuksin and Shirikyan were the first to consider exponential mixing for the SNS in
the case when $\eta \neq 0$ [KS02]; however, their norm is slightly weaker. The norm we
give here gives total variation convergence on a subset of the space which dictates the
asymptotic behavior, namely $X_\ell$. This allows one to use standard mixing results to
get laws of large numbers, central limit theorems, and other results. With additional
work this is also possible directly in the framework of [KS02] or [Hai02]. See [Shi02].

In [Mat02c], the case $\eta \neq 0$ was considered in a simple map example and we see
here that those ideas extend to the SPDE context. In [BKL02], exponential convergence
was proven but without the explicit dependence on the initial condition.
That paper along with [Mat02c] were the first proofs of exponential convergence
of the SNS with white in time forcing. In the kicked case exponential convergence
was given in [MY02, KPS02]. The first of these also considers the case where
the system is strongly dissipative as in Theorem 5. In [Hai02], exponential convergence
for a reaction diffusion equation was proved by bringing the paths together
asymptotically using a coupling construction inspired by [Mat02c]. (Both [Mat02c]
and [BKL02] were delayed considerably in the review process, and hence, [Hai02]
appeared first.)

To state a slightly stronger result, for any weighting function $R : X \to [0, \infty)$
define

$$\|\mu_1 - \mu_2\|_{R,*} = \sup_{\phi \in G^R_*} \int \phi(x)\,\mu_1(dx) - \int \phi(x)\,\mu_2(dx)$$

where $G^R_*$ is the set of all measurable functions $\phi : X \to \mathbb{R}$ with $|\phi(u)| \le R(u)$ for all $u \in X$ and

$$|\phi(\ell + h) - \phi(\ell + \tilde h)| \le \big[1 + R(\ell + h) + R(\ell + \tilde h)\big]\, |h - \tilde h|_X$$

for all $\ell \in X_\ell$ and $h, \tilde h \in X_h$.



Corollary 15.1. In the same setting as Theorem 8, for any $\frac1p + \frac1q = 1$ with $q, p > 1$

$$\|P_t(u_0^{(1)}, \cdot\,) - P_t(u_0^{(2)}, \cdot\,)\|_{R,*} \le \big[1 + (\mathbb{E}R(u(t))^q)^{\frac1q} + (\mathbb{E}R(\tilde u(t))^q)^{\frac1q}\big] \times K'\big[1 + \mathbb{E}V(u_0^{(1)}) + \mathbb{E}V(u_0^{(2)})\big]^{\frac1p} e^{-\gamma' t}$$

where $\gamma'$ and $K'$ are positive constants depending on $p$ and $q$.

Notice that if $R(x) = V(x)$ then the assumptions of the corollary are satisfied
and $\mathbb{E}V(u(t))^q \le K''[1 + \mathbb{E}V(u_0)^q]$ for some $K''$ as $V^q$ is also a Lyapunov function.
This corollary is suboptimal as the right hand side does not scale linearly in $V(u_0)$ so
a convenient operator norm is not induced. See [MSH02] for ideas, from the Markov
setting, which likely could overcome this deficiency.



15.1. Deconstruction and Reconstruction 

We begin with an overview of the coupling construction. The idea is to factor
the measure induced on $C(0, \infty; X_\ell)$ starting from $u_0$ and $\tilde u_0$ and build a process on
$C(0, \infty; X) \times C(0, \infty; X)$ so that the marginals are distributed as a process started
from $u_0$ and $\tilde u_0$ respectively and so that $\Pi_\ell u(t) = \Pi_\ell \tilde u(t)$ with positive probability.
There is the added complication that we need to also have the processes use the
same realization of $\eta = \Pi_h W$ and that we need to localize the trajectories to the
nicely growing and averaging paths so that $\Pi_h u(t) - \Pi_h \tilde u(t)$ will converge to zero at
a controlled rate. We begin with the localization.

The $B_n$ defined in section 11 were sufficient for localizing to prove uniqueness.
They also showed how typical paths stayed in a logarithmic envelope about the
average behavior. However the probability of deviating from a given $B_n$ after
time $t$ decays slowly. To prove exponential convergence, we now localize with sets
from which it becomes exponentially unlikely to deviate over time. For positive $M$
define



$$B(M) = \Big\{ u \in C(0, \infty; X) : V(u(t)) + (1 - \epsilon_*)\int_0^t U(u(s))\,ds - V(u_0) < M + C_1(1 + \epsilon_*)t \ \text{ for all } t > 0 \Big\} \qquad (27)$$



The constant $\epsilon_*$ is chosen so that $c_1 - \frac{1+\epsilon_*}{1-\epsilon_*}C_1(c_2 + c_3) = \frac12\gamma_*$. Recall that $\gamma_* =
c_1 - C_1(c_2 + c_3)$ was assumed positive. Clearly $\mathbb{P}\{u \in \bigcup_{M=1}^\infty B(M)\} = 1$ and, by
(22), $\mathbb{P}\{u_{[0,t]} \in \Pi_{[0,t]}B(M) \mid u \notin B(M)\}$ decays exponentially in $t$. Furthermore
given the choice of $\epsilon_*$, Lemma 13.1 (part one) holds with $B_n$ replaced with $B(M)$
and different constants on the right hand side of the decay estimate. Precisely, if
% = 1, 2) where u®(t) = £(t) + <$>\ t {t m ] hf) then for t G (0, T] 



u io,t) e n [0 , T )B(M) 



(2)x 



< 



h 



(1) 



h 



(2) 



e M e 



(28) 



Fix some M. For every (u§\ 11^,1]) G X x X x Il h Q, we define 

Bp^fafUS^) = {Vl e n £ X [0 , n] : u g n] G n [0 , n _i)B(M), < M 

where = £(s) + ^(l [0 , s] , nrf)} 

and define for A C n^X[ 0in _i) the measure 

^i,n]( u O' >l ;Ao,n]) = P ( n ^[i,n] £ A and n £ M [0in] G %„]|ti(0) = u ,^ n] } 

= E{l A iU e U [lM )l B{On] (U e U [OM )\u(0) = M ,^[o,n]}- 

where -B[o,n] = B[o,n](u^\ rj). Hence QJ ± n ^(u , A; B[o t n]) is the measure of paths 
so that n^[ 0i „] G -B[o,n] and n^-U[ l n ] G A if one conditions to start from Uq and use 
noise realizations W so n^H^ = rj. Of course, it is not a probability measure as it 
does not have total mass one. 

Given any two measures $\mu_1$ and $\mu_2$, one can always write them as a density
against a common third measure. That is $\mu_i(dx) = f_i(x)\mu_3(dx)$ for $i = 1, 2$. We
define the measures $\mu_1 \wedge \mu_2$ and $\mu_1 - \mu_2$ respectively by the densities $(f_1(x) \wedge
f_2(x))\,\mu_3(dx)$ and $(f_1(x) - f_2(x))\,\mu_3(dx)$. It is easy to see that this definition is
independent of the choice of $\mu_3$. If $\mu_1$ does not dominate $\mu_2$ for all measurable sets
then the second measure is a signed measure. See Appendix B for more explanation and
the relation to the total variation norm, which we denote by $\|\cdot\|_{TV}$.
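
For discrete measures these definitions can be checked directly. The following Python sketch is only an illustration under the assumption that measures are finite dictionaries mapping points to masses, with counting measure on the union of the supports as the common reference measure; the function names are hypothetical. It also checks the identity that, for two probability measures, the mass of $\mu_1 \wedge \mu_2$ equals $1 - \frac12\|\mu_1 - \mu_2\|_{TV}$ when the total variation norm is normalised so that mutually singular probability measures are at distance 2.

```python
def minimum_measure(mu1, mu2):
    """Density-wise minimum mu1 ^ mu2 of two discrete measures (dict: point -> mass)."""
    points = set(mu1) | set(mu2)
    return {x: min(mu1.get(x, 0.0), mu2.get(x, 0.0)) for x in points}


def difference_measure(mu1, mu2):
    """mu1 - mu2; this is a signed measure unless mu1 dominates mu2."""
    points = set(mu1) | set(mu2)
    return {x: mu1.get(x, 0.0) - mu2.get(x, 0.0) for x in points}


def total_variation(mu1, mu2):
    """||mu1 - mu2||_TV with the convention that singular probabilities are at distance 2."""
    return sum(abs(m) for m in difference_measure(mu1, mu2).values())


mu1 = {"a": 0.5, "b": 0.5}
mu2 = {"a": 0.25, "b": 0.25, "c": 0.5}
common_mass = sum(minimum_measure(mu1, mu2).values())
```

The quantity `common_mass` is the largest probability with which two random variables distributed as `mu1` and `mu2` can be made to coincide, which is why minimum measures are the natural building block of a coupling.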
With this notation define 

^(4 1} ,4 2) , •) = QJ,„)(4 1) , •;Ao,n))Agf lin) rf ) , • ; s [0>B) ) 

where again S [0;n ) = 5[ 0) „)(^ 1) , w[, 2) , 77). Next for A C X [0jTl ) and £ G ILjX[ 0)n _i) 
define 

i^(w ,,4|f) = P{w [0 ,n) G -A|w(0) = M ,n^[i in) = Vn-l),^[o,„)}- 

In words H^(uo, ■ \i) is the measure induced on X[ 0i „) by paths W[o,n) conditioned to 
start at uo, use noise realization r\, and such that u(s + 1) = £(s) for s G [0, n — 1). 

Next we define the two families of measures r n and s n , n G {1,2,. ..,00}, which 
will be critical in our construction. They will both be measures on ILjXr 0)n ) x 
rL?X[ 0in ) x ri/jfifo^) with n G {1,2,..., 00}. In general, we will use bold letters to 
denote measures on such spaces and capital bold letters for probability measures on 
such spaces. Define 

s n (uo, uo, du x du x drf) 

[Hl{uo,du\i) x H%(uo,du\e)M(uo,u ,de) x F(d V ) 



/ 



'[0,n-l) 

and 

r «+i( M o, u , du x du x drj) = [PiS n — s„ +1 ] (u , u , du x du x drf). 

Here Pis n is the measure on ILjX[ 0in+ i) x ILjX[ 0in+ i) x Uh^l[o, n +i) obtained by first 
stepping with s n and then with 

P n (uo,Uo,dU[o tn ) X diL[o,n) X <%),n)) = P[ Q n) (u , dU[ , n )) X P^ n) (u dU[ ,n)) X P(<%),n)) 

where for A C X [0 , n ) and P^ Qn) (u ,A) = P{w [0in ) G A\u(0) = u ,J^ 0n) }. That is to 
say, 

PlS„(w ,M0,dW[0,n+l) X dU[o, n +l) X ^[0,n+l)) = S n (u , U , dU[ , n ) X dU[o,n) X (%),„)) X 

Pi(u(ra),«(n),du[ n)n+ i) x du[ n) „ + i) x d77[ n>n+1 )). 

Define ri(u ,ito, • ) = Pi(u ,u , ■ ) -Si(u ,uo, • )• Since Pis„(w , «o, -4) > 
s n+i(wo, We A) by construction for all measurable sets A, r n is a standard measure 
and not a signed measure. Lastly, we define 

Pn{u ,U ) = S n (u , M ,X[ 0i „) X X[ , n ) X IL^^n)) 

for n > (including n = 00) and p — 1 and the probability transition kernels 

c / \ s„(m ,mo, • ) td / ~ \ r n (u ,u , • ) 

S„(m ,m , • ) = 7 R n {u ,u , ■ ) 



p n (u ,U ) " ' P„_i(m ,M ) - p n (u ,U ) 
If the denominator is zero in either of the above definitions, we set the corresponding
measure to the zero measure. Observe that 

p n (u ,u ) =E^(tto,«o,II/X[o )n _i)) 

= 1 - ^E||Qf ljn) («0, • 5%n)) - Q[ ltn) (uO, ■ ;%n))||TV > 0, 

where $B_{[0,n)} = B_{[0,n)}(u_0, \tilde u_0, \eta)$. This holds even for $n = \infty$, since for all $n$ the
measures are absolutely continuous for almost every $\eta$. This can be seen by the
same calculations as in the proof of Lemma 14.1 coupled with Lemma 8.3. Also
observe that $p_n(u_0, \tilde u_0) \ge p_{n+1}(u_0, \tilde u_0)$. Thus, we have

$$1 = p_0 \ge p_1 \ge \cdots \ge p_\infty .$$

For all $M$ sufficiently large, we will see in Lemma 15.2 that $p_\infty(u_0, \tilde u_0) > 0$ for all
$u_0, \tilde u_0$ with $V(u_0), V(\tilde u_0) < M_0$.

From the properties of $s_n$ and $r_n$, one has $P_1(u_0, \tilde u_0, \cdot\,) = s_1(u_0, \tilde u_0, \cdot\,) +
r_1(u_0, \tilde u_0, \cdot\,)$ and

$$P_2 = P_1 s_1 + P_1 r_1 = s_2 + [P_1 s_1 - s_2] + P_1 r_1 \overset{M}{=} s_2 + r_2 + r_1$$

where we have suppressed the dependence of the kernels on the initial conditions $u_0$
and $\tilde u_0$. By $\overset{M}{=}$ we mean that the two measures have the same relevant marginals.
More precisely if we consider the kernel at the point $(u_0, \tilde u_0)$, the joint distribution
of the first and last coordinate of both sides is $P^\eta_{[0,2)}(u_0, \cdot\,) \times \mathbb{P}(d\eta)$ and the joint
distribution of the second and last coordinate of both sides is $P^\eta_{[0,2)}(\tilde u_0, \cdot\,) \times \mathbb{P}(d\eta)$.
Continuing along this line and normalizing the measures to probability measures
produces the following version of the factoring lemma from [Mat02c].

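$$P_\infty(u_0, \tilde u_0, \cdot\,) = p_\infty S_\infty + \sum_{n=1}^\infty [p_{n-1} - p_n]\, P_\infty R_n \qquad (29)$$

where $P_\infty R_n$ is analogous to $P_1 s_n$ from above. On the right hand side we have
suppressed the dependence on $u_0$ and $\tilde u_0$ in the interest of space. That is $S_\infty =
S_\infty(u_0, \tilde u_0, \cdot\,)$, $p_{n-1} = p_{n-1}(u_0, \tilde u_0)$ and so forth.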

Such a factorization of the futures was also fundamental to the results in [Hai02].
Since the projection of this measure onto the first and last coordinate of both sides
equals $P^\eta_{[0,\infty)}(u_0, \cdot\,) \times \mathbb{P}(d\eta_{[0,\infty)})$ and the projection on the second and last coordinate
of both sides is $P^\eta_{[0,\infty)}(\tilde u_0, \cdot\,) \times \mathbb{P}(d\eta_{[0,\infty)})$, we have built a representation of two copies
of the process which both use the same $\eta$. The first is distributed as a solution
starting from $u_0$ and the second as a solution starting from $\tilde u_0$. This representation
has the following important feature. There exists a set $A \subset X_{[0,\infty)} \times X_{[0,\infty)} \times
\Pi_h\Omega_{[0,\infty)}$ so that $S_\infty(A) = 1$ and if $(u, \tilde u, \eta) \in A$ then $\Pi_\ell u(s) = \Pi_\ell \tilde u(s)$ for all $s \ge 1$,
$u_{[1,\infty)}, \tilde u_{[1,\infty)} \in B(M)$, and $u$, $\tilde u$ are solutions for some noise realizations $W$ and $\tilde W$ so
$\Pi_h W = \Pi_h \tilde W$. These are precisely the conditions needed to apply the contractive
estimates from section 13.
This factorization states that drawing from $P_\infty$ is equivalent, as far as either
$u(t)$ or $\tilde u(t)$ is concerned, to drawing from $S_\infty$ with probability $p_\infty$ and $P_\infty R_n$ with
probability $p_{n-1} - p_n$. Of course, we have built in useful correlations between the two
processes. Also notice that $P_\infty$ appears on the left hand side, so the factorization
can be iterated.
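
The bookkeeping behind this statement is a telescoping sum: the weight $p_\infty$ together with the weights $p_{n-1} - p_n$ for $n \ge 1$ add up to $p_0 = 1$. The following Python sketch draws the component of the mixture for a finite, decreasing sequence $p_0, \dots, p_N$, treating $p_N$ as a stand-in for $p_\infty$. The truncation, the numerical values, and the function names are all assumptions made for illustration; the sketch samples only the label of the component, not the path-space measures themselves.

```python
import random


def mixture_weights(p):
    """Weights of the factorization for a truncated sequence p = [p_0, ..., p_N].

    Returns (coupled_weight, residual_weights), where coupled_weight = p_N plays
    the role of p_infinity and residual_weights[n - 1] = p_{n-1} - p_n.
    """
    if p[0] != 1.0 or any(b > a for a, b in zip(p, p[1:])):
        raise ValueError("expected 1 = p_0 >= p_1 >= ... >= p_N")
    residual = [a - b for a, b in zip(p, p[1:])]
    return p[-1], residual


def draw_component(p, rng):
    """Return 'S' (coupled segment) or the index n of the residual kernel R_n."""
    coupled, residual = mixture_weights(p)
    u = rng.random()
    if u < coupled:
        return "S"
    acc = coupled
    for n, w in enumerate(residual, start=1):
        acc += w
        if u < acc:
            return n
    return len(residual)  # guard against floating-point round-off
```

Calling `draw_component([1.0, 0.8, 0.7, 0.65], random.Random(0))` repeatedly returns `"S"` in roughly 65 percent of the draws, and each failure to couple is labelled by the unit time interval in which the two low-mode paths separate.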

15.2. Estimates on the p's

The following estimates on the $p$'s are the principal information needed to prove
the exponential mixing, other than the Lyapunov structure which will be described
in the next section. The first estimate is enough to imply mixing. The fact that the
spacing between the $p$'s decays exponentially, combined with the exponential tails
of the return time to the set $C$ defined in the following section, gives the exponential
mixing rate.

Lemma 15.2. In the setting of Theorem 8, let $B(M)$ be the set used to define $B_{[0,n)}$
in the previous section. For any $M_0 > 0$ the following estimates hold for all $M$ large
enough:

1. There exists a positive constant $p^*_\infty$, depending on $M$ and $M_0$, so that

$$\inf_{u_0^{(i)} :\, V(u_0^{(i)}) < M_0} p_\infty(u_0^{(1)}, u_0^{(2)}) \ge p^*_\infty > 0 .$$

2. There exist positive constants $K_1$ and $\gamma_1$, also depending on $M$ and $M_0$, so
that for all $u_0^{(i)}$, $i = 1, 2$ with $V(u_0^{(i)}) < M_0$,

$$p_n(u_0^{(1)}, u_0^{(2)}) - p_{n+1}(u_0^{(1)}, u_0^{(2)}) \le K_1 \exp(-\gamma_1 n) .$$

The proof of this lemma will be given in section 15.7.

15.3. Consequences of the Lyapunov Structure 

We now make a modification in the presentation relative to [Mat02c] which is
greater than notational (but still mainly cosmetic). We want to iterate the expansion
(29). However we will only have nice control over the $p_n$'s for $u_0, \tilde u_0$ in a particular
subset of the phase space. Hence, we modify the expansion to include the steps
needed to return to this subset.

As already mentioned under Assumption 4, a lemma analogous to Lemma 10.1
holds for the Lyapunov function $V$. From this it is straightforward that there exists
an $a \in (0,1)$ so that $\mathbb{E}\{V(u(t+1)) \mid \mathcal{F}_t\} \le aV(u(t)) + C_1$. Hence, if we define
$\mathcal{V}(u, \tilde u) = V(u) + V(\tilde u)$ then

$$\mathbb{E}\{\mathcal{V}(u(t+1), \tilde u(t+1)) \mid \mathcal{F}_t\} \le a\,\mathcal{V}(u(t), \tilde u(t)) + 2C_1 .$$

We define the set C = {(u, u) : V(w, u) < ^} and the stopping time 

r c = inf{s > : s E N; (u(s),u(s)) E C}. 
Lastly set M = ^ and fix M, from the previous two sections so the conclusions
of Lemma 15.2 hold. The importance of this choice of Mq and hence the definition 
of C are given by the following result. 

Lemma 15.3. Under Assumption 4, $\mathbb{P}\{\tau_C(u_0, \tilde u_0) > n\} \le K_0 \gamma_0^n\, \mathcal{V}(u_0, \tilde u_0)$ for any
$\gamma_0 \in (a, 1)$ and some positive $K_0 = K_0(\gamma_0)$.

Proof of Lemma 15.3: This result can be found in many places. See for instance
Lemma 11.3.9 of [MT93], Lemma 9.3 of [MSH02] or, in the continuous time setting
and in the context of the SNS, Lemma 3.2 of [EM01]. □
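
The mechanism behind such return-time estimates is that a one-step drift inequality of the form $\mathbb{E}V_{k+1} \le a\,\mathbb{E}V_k + c$ forgets the initial condition geometrically. The short Python sketch below iterates this inequality and compares it with its closed form; the numerical values of $a$ and $c$ and the function names are illustrative assumptions and are not constants from the text.

```python
def drift_bound(v0, a, c, n):
    """Iterate the bound E V_{k+1} <= a * E V_k + c for n steps, starting from v0."""
    bound = v0
    for _ in range(n):
        bound = a * bound + c
    return bound


def drift_bound_closed_form(v0, a, c, n):
    """Closed form a^n * v0 + c * (1 - a^n) / (1 - a) of the same recursion."""
    return a ** n * v0 + c * (1.0 - a ** n) / (1.0 - a)
```

The term $a^n v_0$ is the only place where the initial condition enters, which matches the linear dependence on $\mathcal{V}(u_0, \tilde u_0)$ and the geometric rate in the lemma; the limiting level $c/(1-a)$ indicates the scale at which a return set can be placed.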

15.4. Coupling: A New Representation of the Process 

We will define a new presentation of the chain using the factorization (29). First
however, we modify the factorization slightly. In light of the previous section, the
process $(u(t), \tilde u(t))$ returns to the set $C$ infinitely often at integer times almost surely.
Let $P_*(u_0, \tilde u_0, \cdot\,)$ be the distribution of $(u_{[0,\tau_C]}, \tilde u_{[0,\tau_C]}, \eta_{[0,\tau_C]})$ where $(u(0), \tilde u(0)) =
(u_0, \tilde u_0)$. Then $P_*(u(0), \tilde u(0), \cdot\,)$ is a probability measure on $\mathcal{X}$ where

$$\mathcal{X} = \bigcup_{k=0}^\infty X_{[0,k]} \times X_{[0,k]} \times \Pi_h\Omega_{[0,k]} .$$

The case $k = 0$ is added to cover the situation when $(u_0, \tilde u_0) \in C$ already. Since we
only want to use the previous factorization for $(u_0, \tilde u_0) \in C$, we redefine $p_n(u_0, \tilde u_0) = 0$
for $(u_0, \tilde u_0) \notin C$ and set $S_n$ equal to the null measure for $(u_0, \tilde u_0) \notin C$. Hence
for $(u_0, \tilde u_0) \notin C$, $r_1 = P_1$ and all other $r_n$ are then the null measure. The result is
that for $(u_0, \tilde u_0) \notin C$, the chain takes a step of length one with $u$ and $\tilde u$ stepping
independently.

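Returning to the general case $(u_0, \tilde u_0) \in X \times X$. Defining $R_{n*} = P_* R_n$, the
factorization (29) can be rewritten

$$P_\infty(u_0, \tilde u_0, \cdot\,) = p_\infty S_\infty + [1 - p_\infty] \sum_{n=1}^\infty \frac{p_{n-1} - p_n}{1 - p_\infty}\, P_\infty R_{n*} \qquad (30)$$

Again we have suppressed the dependence of the right hand side on the initial
conditions. Defining

$$R_{\infty*}(u_0, \tilde u_0, \cdot\,) = \sum_{n=1}^\infty \frac{p_{n-1}(u_0, \tilde u_0) - p_n(u_0, \tilde u_0)}{1 - p_\infty(u_0, \tilde u_0)}\, R_{n*}(u_0, \tilde u_0, \cdot\,), \qquad (31)$$

we consider the chain $X_n = (x_n, \tilde x_n, \eta_n)$ on the state space $\chi = \mathcal{X} \cup (X_{[0,\infty)} \times X_{[0,\infty)} \times
\Pi_h\Omega_{[0,\infty)})$ given by taking steps from the probability transition kernel

$$p_\infty(u_0, \tilde u_0)\, S_\infty(u_0, \tilde u_0, \cdot\,) + [1 - p_\infty(u_0, \tilde u_0)]\, R_{\infty*}(u_0, \tilde u_0, \cdot\,) . \qquad (32)$$

We define

$$t_n = \sum_{k=1}^{n-1} |x_k| = \sum_{k=1}^{n-1} |\tilde x_k|$$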
k=l k=l 
where $|x_k|$ is the length of the trajectory segment $x_k$. $t_n$ is the time passed in the
physical PDE setting after $n$ steps of the chain have passed. Since the chain adds
segments of random length on each step, $t_n$ is a random quantity. Similarly associated
to $X_n$ is a trajectory $(u(t), \tilde u(t))$ of the SPDE. It is defined by $(u(t), \tilde u(t)) =
(x_n(t - t_n), \tilde x_n(t - t_n))$ where $t_n$ is the unique $t_k$ such that $t_k \le t < t_k + |x_k|$. We will
use both notations depending on which is the most convenient. We are, of course,
only interested in $X_n$ through the step when $|x_n| = \infty$. This happens the first time
a segment is drawn from $S_\infty$. For reasons that will be clear, if they are not already,
we refer to this as the "coupling time." We define the stopping time

$$\tau = \inf\{n : |x_n| = \infty\}. \qquad (33)$$

We pause for a second to notice some of the properties of the chain we have
built. On the first step if $(u(0), \tilde u(0)) \notin C$, it takes one step, adding a piece of
trajectory of variable, integer length according to $P_*(u(0), \tilde u(0), \cdot\,)$. Hence, at the
end of this step, the system is in $C$. Henceforward each step starts and ends in
$C$. With probability $1 - p_\infty$ the chain draws from $R_{\infty*}$. Each of these paths is
of finite length. Their statistics are discussed below. With probability $p_\infty$ a path
of infinite length is drawn from $S_\infty$. After one unit of time, paths drawn from $S_\infty$
are, by construction, contained in $B(M)$. In addition by construction, they have
norm at time one less than $M$, use the same $\eta$ increments, and agree on $X_\ell$ for
$t \ge 1$. Since at time one the norm is less than $M$, we have an a priori bound on the
separation in the high modes. Thus, if $(v, \tilde v, \eta)$ is drawn according to $S_\infty$, then from
(28), \v(t) - v(t)\x = \ U hv{t) - n h v(t)\ x < Me M exp(-i 7 «t). 



15.5. The Heart of the Convergence Result 

We now show how the previous two sections quickly give the needed estimates
to prove Theorem 8. For $\phi \in G_*$ one has

!#(«(*)) - = EM«(f)) - mmiKn + v<§] 

< 2F{t T >h + Me M exp(-^ 7 „t). (34) 

The first term in the estimate follows from $|\phi(u(t)) - \phi(\tilde u(t))| \le 2$ and $\mathbb{E}\,1_{t_\tau > t/2} =
\mathbb{P}\{t_\tau > t/2\}$. The second term follows because for $t \ge 2t_\tau$ the system has been
following a trajectory drawn from $S_\infty$ for at least $t/2$ units of time. Hence,

\U h u(t) - U h u{t)\ x < Me M eM-\l*\) 

as noted in the previous paragraph. Next observe that $t_\tau(u_0, \tilde u_0) = \tau_C(u_0, \tilde u_0) +
t_\tau(u_{\tau_C}, \tilde u_{\tau_C})$ where $t_\tau(u_0, \tilde u_0)$ and $t_\tau(u_{\tau_C}, \tilde u_{\tau_C})$ mean the stopping time starting from
initial conditions $(u_0, \tilde u_0)$ and $(u_{\tau_C}, \tilde u_{\tau_C})$ respectively. Hence,

$$\mathbb{P}\{t_\tau(u_0, \tilde u_0) > n\} \le \mathbb{P}\big(\tau_C(u_0, \tilde u_0) > \tfrac{n}{2}\big) + \sup_{(u_0', \tilde u_0') \in C} \mathbb{P}\big(t_\tau(u_0', \tilde u_0') > \tfrac{n}{2}\big) . \qquad (35)$$

We know from Lemma 15.3 that $\mathbb{P}(\tau_C(u_0, \tilde u_0) > \frac{n}{2})$ is exponentially decaying in $n$
with a constant which scales linearly with $\mathcal{V}(u_0, \tilde u_0)$. Hence, Theorem 8 will be
proven if we show that $\sup \mathbb{P}(t_\tau(u_0', \tilde u_0') > \frac{n}{2})$ decays exponentially in $n$. This is
done in the next section.

The proof of Corollary 15.1 follows from similar reasoning. 

m<t)) - = ntwt)) - mm ikh + ^ 

<ER(u(t))l tT>t _+ER(u(t))l u>t _ 

+ Me M exp(-- 7 ^) [1 + ER(u(t)) + ER(u(t))} 

< [(E(i2(u(*))*))5 + (E(R(u(t)) q ))"} [ntr > \}Y 

+ Me M exp(-i 7 ,t)[l + (ER{u{t)) q )\ + (ER(u(t)) q )^. 

Hence an exponential bound on $\mathbb{P}\{t_\tau > \frac{t}{2}\}$ will also complete the proof of the
Corollary.

15.6. Moments of the Coupling Time 

We now complete the proof of Theorem 8 by providing exponential control of the
moments of $t_\tau$. The missing pieces are the following lemma, which we will prove
at the end of this section, and some estimates on the $p$'s given in the next section.

Lemma 15.4. There exist positive constants $\gamma_2$ and $K_2$ so that for all $(u_0, \tilde u_0) \in
C$, $\mathbb{E}\exp(\gamma_2 |R_{\infty*}(u_0, \tilde u_0)|) \le \exp(K_2)$, where $|R_{\infty*}(u_0, \tilde u_0)|$ is the random variable
distributed as the length of a segment drawn from $R_{\infty*}(u_0, \tilde u_0, \cdot\,)$.

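Using this lemma we quickly finish the proof of Theorem 8. For any $\alpha \in (0, 1)$
and $(u_0, \tilde u_0) \in C$

$$\begin{aligned}
\mathbb{P}\{t_\tau(u_0, \tilde u_0) > n\} &= \mathbb{P}\{t_\tau > n;\ \tau > \alpha n\} + \mathbb{P}\{t_\tau > n;\ \tau \le \alpha n\} \\
&\le \mathbb{P}\{\tau > \alpha n\} + \mathbb{P}\{t_\tau > n;\ \tau \le \alpha n\} \\
&\le (1 - p^*_\infty)^{\lfloor \alpha n \rfloor} + e^{-(\gamma_2 - K_2\alpha)n} \qquad (36)
\end{aligned}$$

where $\gamma_2$ and $K_2$ are the constants from Lemma 15.4 and $p^*_\infty$ is from Lemma 15.2.
The first estimate follows because on each step of the chain there is at least a $p^*_\infty$
chance of drawing from $S_\infty$. Accepting the second estimate for a moment, choosing
any $\alpha \in (0, 1 \wedge \frac{\gamma_2}{K_2})$ gives exponential decay and completes the proof.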
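
The role of the parameter in this last step can be seen by evaluating a bound of the form $(1 - p^*)^{\lfloor \alpha n \rfloor} + e^{-(\gamma_2 - K_2\alpha)n}$ numerically. The Python sketch below does only that; the constants used in the example are made up for illustration (they are not derived from the equation) and the function name is hypothetical. The first term is the chance of not having coupled in the first $\lfloor \alpha n \rfloor$ steps of the chain, and the second controls the chance that those steps took more than $n$ units of physical time.

```python
import math


def coupling_tail_bound(n, p_star, gamma2, k2, alpha):
    """Evaluate (1 - p*)^floor(alpha n) + exp(-(gamma2 - K2 alpha) n)."""
    if not (0.0 < alpha < min(1.0, gamma2 / k2)):
        raise ValueError("alpha must lie in (0, min(1, gamma2 / K2))")
    never_coupled = (1.0 - p_star) ** math.floor(alpha * n)
    segments_too_long = math.exp(-(gamma2 - k2 * alpha) * n)
    return never_coupled + segments_too_long
```

With, say, `p_star=0.1`, `gamma2=0.5`, `k2=2.0` and `alpha=0.125`, both terms decay geometrically in `n`. Taking `alpha` too large makes the exponent of the second term non-positive, which is why the admissible range is capped at $\gamma_2 / K_2$.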

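To see the second estimate, observe that from Lemma 15.4 and the fact that
$(u_0, \tilde u_0) \in C$, $\mathbb{E}\exp\big(\gamma_2 \sum_{k=1}^{\alpha n} |x_k|\big) \le \exp(\alpha n K_2)$. Hence, by the exponential Chebyshev inequality, one has

$$\mathbb{P}\{t_\tau > n;\ \tau \le \alpha n\} \le \mathbb{P}\Big\{\sum_{k=1}^{\alpha n} |x_k| > n\Big\} \le e^{-(\gamma_2 - K_2\alpha)n} .$$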

Proof of Lemma 15.4: Let $|R_{n*}|$ be the random variable distributed as the
length of a trajectory drawn from $R_{n*}(u_0, \tilde u_0, \cdot\,)$. In what follows, we suppress the
dependence on the initial conditions $(u_0, \tilde u_0)$ of the $p$'s and the transition kernels as
we always consider the same initial conditions.

Define the random variable $\zeta$ by

$$\zeta = k \ \text{ with probability } \ \frac{p_{k-1} - p_k}{1 - p_\infty} \quad \text{for } k \in \{1, 2, \dots\} .$$

Then $|R_{\infty*}|$ is distributed as $|R_{\zeta*}|$. Hence, we have

$$\begin{aligned}
\mathbb{P}\{|R_{\infty*}| > n\} &= \mathbb{P}\{|R_{\zeta*}| > n;\ \zeta > \tfrac{n}{2}\} + \mathbb{P}\{|R_{\zeta*}| > n;\ \zeta \le \tfrac{n}{2}\} \\
&\le \mathbb{P}\{\zeta > \tfrac{n}{2}\} + \mathbb{P}\{|R_{\zeta*}| > n;\ \zeta \le \tfrac{n}{2}\} .
\end{aligned}$$

The first term decays exponentially by the second part of Lemma 15.2. This leaves
only the last term.

o A^ n T^V^ m ri™ ^Pk-l-Pk 

IV. 



k=i 

Notice that $|R_{k*}|$ is $k$ plus the time to return to $C$ starting from $(u(k), \tilde u(k))$. Using
Lemma 15.3 and that by definition $[p_{k-1} - p_k] R_k = r_k$ produces

ii 

P{|R ( ,|>n;C<-} 



< 



1 2 f 

— / ^i T c(u(k),u(k)) > n - k}R k (du t du)\p k -i - p k \ 

1 po ° k=i J 



< 



K 



J^7" k I y(u(k),u(k))r k (u ,u ,du [0>k) ,du [0;k) ). 



1 " A» fc=i 

By the definition of $r_k$ one sees that for any measurable set $A$, $P_k(u_0, \tilde u_0, A) \ge
r_k(u_0, \tilde u_0, A)$. Since $\mathcal{V}$ is positive, we have



/ 



V(u(k),u(k))r k (du [0jk) ,du [0:k) ) < J \(u(k),u(k))P k (du^ k) , du [0 , k) ) 

= E{V(u(k),u(k))\(u(0),u(0)) = (u ,u )} 
< K" since (uo,Uo) £ C. 



The uniform bound on the integral used to obtain the last estimate comes from a 
lemma controlling V completely analogous to Lemma 10.1 about the energy of the 
SNS. It can be found in many places. It is simply integrating up the Lyapunov 
estimate in time. See for instance Lemma 9.3 of [MSH02] or Lemma 11.3.9 of 
[MT93]. Continuing, one has 

n K 1 K' 

P{|R C ,| > n; C < -} < — - Y.^' k < -T—-^ 

z 1 Poo fc=1 1 Poo 

□ 
15.7. Proof of Lemma 15.2



Proof of Lemma 15.2: The details of a similar argument are on page 452 of
[Mat02c]. We begin with the first statement. For any $M > 0$ and $A \subset X_\ell$, define

Q v t (u ,A; M) = P(II*u(t) G A; V(u(t)) < M\u(0) = uo,J% tt] ) 
= E{l A (tt e u(t))l V ( u{t)) < M \u(0) = Uo,J^ 0tt] }. 

Since sup Uo .y( U() ) <Mo K{V (u(t))} < oo, for all M sufficiently large one has 

inf F{V(u(t)) < M5} > 1-5/10. 

u :V(u )<Mo 

Hence there exist a SI" C SI' so F(rj G Si") > 5/2 and for all rj G SI" and uf G X, % = 
1, 2, with V{uf) < M , one has \\Q v t (u ( o \ • ; Ms) - Q?( U ( 2) , • ; M5)|| Ty < 1 - 5/2. 
Now define r"(uo,«o, • ) = QJ 0jOo) {uq, • ;« G £(M)) A Qj, j0o) («o, • 5« G ^(M)). 



Then 



5 2 

Poo(«o,«o) > -r inf EP(tio,«o ) n { I [()i0o) x n f x [M x n f fl [0i0o) ) 
4 noi u o€A,n^M =ri£Mo 



l/(u ),%)<M5 



Since (28) holds in this setting, the exact same calculations as in the proof of the
second half of Lemma 14.1 hold, producing an estimate identical to Lemma 14.2
with $B'_n$ replaced by $\{u \in B(M)\}$ and valid for $u_0, \tilde u_0$ with $V(u_0), V(\tilde u_0) < M_\delta$.
Combining this estimate with Lemma B.1, we obtain for any $p > 1$



inf Er^(w ,wo,ILX[o,oc) x n^Xr ,oo) x n^r ,oo)) > 

Mo,uoGX,n^no=n^uo 

V{u ),V(u )<Mh 



1 

1 - - 

P. 



C{M)T- 



pp- 



where C(M) = inf^^^i EQJ^K UeX [0tOo) x n,X [0)Oo) x n^ [0)OO ); « G B(M)) 
and -D* is the constant defined analogously to (26). 
Notice that 

E Q[o,oc)KlLX [0 ,oc) x U e X [Q>00 y,ue B(M)) = P{u [0tOo) G B(M)\u(0) = u }. 

Hence for M sufficiently large, for all u with V(u ) < there exists a set SI'q C 
n<fi so thatPfa G Q' ") > 1-5/100 and for all rj G fig' P$ >oo) («o, £(-&0) > 1-5/100. 
Hence C(M) > (1 - 5/100) 2 . This completes the first claim. 

Now consider the second claim. Setting Y n = n^X[o iW ) x n^X[o in ) x n^fi[o, n ), 
notice that p n _i(it , uo) - Pn(u ,u ) = r n (u ,u ,Y n ) = [PiS„_i - s n ](u , u , Y„). 
From this we see that $p_{n-1} - p_n$ is the probability of drawing from $s_{n-1}$ but not
from $s_n$. There are two ways this can happen. First the trajectory can leave the
set $B(M)$ between time $n-1$ and $n$. This probability is exponentially small in
$n$ by the construction of $B(M)$ and the estimate in (27). The second way is to
draw from the part of the distribution contained in $B(M)$ between time $n-1$ and
$n$ but not in the common part of the two $Q^\eta$ distributions. Over $[0, n-1]$
trajectories $(u, \tilde u, \eta)$ are drawn from $s_{n-1}$. Hence almost every trajectory has the
properties that n^U[i in _i] = ILj-U[i in _i] and both are in il[o in _i]-B(M). The contrac- 
tive property derived analogously to (25) then implies, \u(n — 1) — u(n — l)| x < 



Me M e"5>( n - 1 ). 



Let B 



[n— l,n] 



U 



[o,n-i),M[o,n-i),^[o,n-i)) be the paths in n^X [n _ 1)n] 



x 



n^X [n _i in] x n^ [n _ 1)n] so that when added to (u[ , n -i], «[o,n-i], V[o,n-i]) the result- 
ing path (u[ 0l n],W[o,n]>»7[o,n]) is such that «[o,n]» W[o,n] € n [0jn ].B(M). (As before the 
part of the trajectory in Hh&[n-i,n] has to be reconstructed with the aid of $.) 
Hence Q v [Q l] (u(n - 1), • ; £[„_!,„]) and Qjj^i^ra - 1), • ; £[ n -i, n ]) where B[„_i,„] = 

-B[n-i,n](^[o,n-i], ^[o,n-i], ^[o,n-i]), are the two distributions which will be used to draw 
the next unit length step. Thus the term we need to control is 



-E||Qj ))1] («(n - 1), • ; B [n _± M ) - Qj Q1] (u(n - 1), • ; £ [n _ lin] )|| Ty 



< E E 




dQj 01] (u(n-l), • ) 
dQl^uin-l), •) 



*[n-l,n] 



< 



The main estimate comes from the last estimate of Lemma A.l applied on the 
measure conditioned on a fixed rj path. The estimate exp(i^e~2> n ) is the estimate 
on the constant used in Lemma A.l. This estimate is a consequence of the 
contractive property noticed above use to estimate the difference term 



exp 



t 2 . 

mm 



\U t G(u(t)) -n e G(u(t))\ldt 



in a fashion analogous to (26). 



16. Other Examples 

The general assumptions used in the previous example are general enough to
cover a number of SPDEs of interest. A natural second example where all of our
analysis applies is the stochastically forced Cahn-Allen/Ginzburg-Landau equation

\[
du(x, t) = [\nu \Delta u + u - u^3]\,dt + dW(x, t) \tag{37}
\]

where $W(x,t) = \sum_k \sigma_k e_k(x) \beta_k(t)$, the $\beta_k$ are independent standard Brownian motions,
the $\sigma_k$ are positive constants and the $e_k$ are the elements of the real Fourier basis

$\{1, \sin(2\pi x), \cos(2\pi x), \sin(4\pi x), \cos(4\pi x), \dots\}$.

See [BM03, EL02] for the verification of the assumptions. (Note that the text there assumes
that $X_h$ is not forced; however, the verification of the assumptions given there allows
one to cover that case with the theorems provided in this text.) One uses the
Lyapunov structure $V(u) = U(u) = |u|^2_{L^2} + |\nabla u|^2_{L^2}$. That case is also analyzed in
[Hai02]. In that reference, the strong contractive nature is used to get an exponential
mixing rate uniform in the initial data. This is because the time for the initial return
to the center of the phase space does not depend on the initial state; this is not the case
in the SNS equation. This holds because one can estimate the time $\tau_C$ uniformly in
the initial data. Hence from (35), one sees that the mixing time can be estimated
independent of the initial data. This is made explicit in the theorems in [Mat02c].
Another noteworthy feature of the analysis in [Hai02] is that a change of measure
is made in the low modes to steer all of the modes together only asymptotically. This is in
contrast to the presentation given here, where the $\ell$ variable is made to be equal for
all moments of time after $t = 1$ and the $h$ variable converges asymptotically. The
method in [Hai02] appears to be simpler to construct while the method exposed here
gives convergence in a slightly stronger topology.
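To get a feel for the dynamics of equation (37), the following is a minimal numerical sketch, not a method taken from the cited references. It assumes a periodic unit interval, a finite-difference Laplacian, an explicit Euler-Maruyama time step, and forcing truncated to the first few elements of the real Fourier basis with a common amplitude `sigma`. The function name `simulate_cahn_allen` and all parameter values are hypothetical choices made for illustration.

```python
import math
import random


def simulate_cahn_allen(nu=0.01, n_grid=32, n_modes=3, sigma=0.1,
                        dt=1e-3, n_steps=1000, seed=0):
    """Explicit Euler-Maruyama sketch for du = [nu*u_xx + u - u^3] dt + dW
    on the periodic unit interval; returns the final profile as a list."""
    rng = random.Random(seed)
    dx = 1.0 / n_grid
    xs = [i * dx for i in range(n_grid)]

    # Truncated real Fourier basis {1, sin(2 pi x), cos(2 pi x), ...}.
    basis = [[1.0] * n_grid]
    for m in range(1, n_modes + 1):
        basis.append([math.sin(2 * math.pi * m * x) for x in xs])
        basis.append([math.cos(2 * math.pi * m * x) for x in xs])

    u = [0.0] * n_grid
    sqrt_dt = math.sqrt(dt)
    for _ in range(n_steps):
        # One independent Brownian increment per forced basis element.
        xi = [sigma * sqrt_dt * rng.gauss(0.0, 1.0) for _ in basis]
        new_u = []
        for i in range(n_grid):
            # Periodic second difference; u[i - 1] wraps around at i = 0.
            lap = (u[(i + 1) % n_grid] - 2.0 * u[i] + u[i - 1]) / dx ** 2
            drift = nu * lap + u[i] - u[i] ** 3
            noise = sum(c * e[i] for c, e in zip(xi, basis))
            new_u.append(u[i] + dt * drift + noise)
        u = new_u
    return u
```

With the default values the explicit scheme satisfies the usual stability restriction $\nu\,dt/dx^2 < 1/2$. Setting `sigma=0.0` leaves the zero profile unchanged, which gives a quick sanity check of the drift term.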

In [EL02] other examples are given; all but the stochastic Kuramoto-Sivashinsky
equation fit directly in the framework given here. The stochastic Kuramoto-Sivashinsky
equation requires localization ideas not based on a straightforward Lyapunov
function. The details are explained fully in [EL02].

17. True Hypoellipticity and the Cascade of Randomness 

It is reasonable to ask if the results given in Theorem 3 or Theorem 7 are sharp. 
Does ergodicity require forcing all of the modes below the scale specified by the balance
between energy influx and dissipation? The assumption for the second part of
Theorem 3 is an ellipticity assumption about the dynamics in the typically unstable 
directions. Equivalently viewed from the Memory/Gibbsian dynamics point of view, 
it means that the reduced system with memory (9) is elliptic. 

While there is no complete proof, there are a number of results which seem to
imply that much weaker conditions are sufficient. They all describe the dynamics
in a hypoelliptic setting, that is, the case where not all of the typically unstable degrees of
freedom are forced directly. In this setting, ergodicity and mixing require that
the nonlinearity transfer the randomness to other degrees of freedom.

The first result given below proves the ergodicity of an arbitrary Galerkin ap- 
proximation of (4) under very weak assumptions. Under similar assumptions, the 
second result says that the full PDE has a transition density whose finite dimensional 
marginals have a density with respect to Lebesgue measure. 

A third result by [Rom02] proves the geometric ergodicity of the Galerkin
projections of the three dimensional SNS equations. This was expected, as the system
shares the needed structure with the two dimensional problem. What was extremely
interesting and novel in that paper was the proof that the system is globally
controllable. A fourth result found in [AS03] shows that the full two and three
dimensional SNS equations are controllable in the sense that one can steer them so that
any finite number of modes take specified values. This is very similar in spirit to
Theorem 10 where only projections of the transition measure are shown to have
a density. The techniques used to prove the control results in [Rom02] and [AS03]
seem to use the same important observation, namely that the off-diagonal nature of
the nonlinearity leaves the system globally controllable even though its nonlinearity is
even powered. We refer the reader to [Rom02] and [AS03] for the precise statement
of the results.
As we will undertake direct calculations, it is simpler to work in a real basis of
$L^2(\mathbb{T}^2)$. For this reason we switch our forcing to the form

\[
W(x, t) = \sum_{k \in \mathcal{K}_{\cos}} \sigma_k^{\cos} \cos(k \cdot x)\, b_k(t) + \sum_{k \in \mathcal{K}_{\sin}} \sigma_k^{\sin} \sin(k \cdot x)\, B_k(t) \tag{38}
\]

where $B_k$ and $b_k$ are independent real Brownian motions with variance one, $\sigma_k^{\cos}$,
$\sigma_k^{\sin}$ are positive real constants, and $\mathcal{K}_{\cos}$, $\mathcal{K}_{\sin}$ are subsets of
$\mathbb{Z}^2_+ = \{j = (j_1, j_2) \in \mathbb{Z}^2 : j_2 \ge 0, |j| > 0\}$. We need only consider $\mathbb{Z}^2_+$ as the reality of the vorticity
allows one to restrict to wave numbers in the upper half plane and we have assumed
the absence of a mean flow. (Note: In [EM01] the sums were restricted too much;
however, this does not affect any of the bracket calculations made and the results
hold true.)

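We now define two sequences of subsets of $\mathbb{Z}^2_+$ which capture how the randomness
spreads from one degree of freedom to the next. Define $\mathcal{K}^N_0 = Z_0 = \mathcal{K}_{\cos} \cap \mathcal{K}_{\sin}$. Next
define

\[
Z_n = Z_{n-1} \cup \{k \in \mathbb{Z}^2_+ : k \in \{\ell + j, \ell - j, j - \ell\} \text{ with } j \in Z_0,\ \ell \in Z_{n-1}
\text{ and } \ell^\perp \cdot j \ne 0,\ |j| \ne |\ell|\}
\]

and fixing some positive integer $N$ define

\[
\mathcal{K}^N_n = \mathcal{K}^N_{n-1} \cup \{k \in \mathbb{Z}^2_+ : k \in \{\ell + j, \ell - j, j - \ell\} \text{ with } j, \ell \in \mathcal{K}^N_{n-1}
\text{ and } \ell^\perp \cdot j \ne 0,\ |j| \ne |\ell|,\ |\ell - j| \le N,\ |\ell + j| \le N\}
\]

and finally $Z_\infty = \cup_n Z_n$ and $\mathcal{K}^N_\infty = \cup_n \mathcal{K}^N_n$. The two sets track the cascade of
randomness out to the unforced modes. The farther along the chain a mode
first enters the sequence of sets, the less the random variation will be felt in that
coordinate.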
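The recursive definitions above are easy to explore by direct enumeration. The following is a small sketch, assuming the reading of the definitions in which $j$ is always taken from $Z_0$ for the sets $Z_n$, both $j$ and $\ell$ are taken from $\mathcal{K}^N_{n-1}$ for the sets $\mathcal{K}^N_n$, and the cutoff conditions are $|\ell \pm j| \le N$. The function names `cascade_step`, `z_sets` and `k_infinity` are hypothetical and chosen only for this illustration.

```python
def in_upper_half(k):
    """Membership in Z^2_+ = {j in Z^2 : j_2 >= 0, |j| > 0}."""
    return k[1] >= 0 and k != (0, 0)


def cascade_step(current, partners, cutoff=None):
    """Add every k in {l+j, l-j, j-l} lying in Z^2_+ with l in `current`,
    j in `partners`, l^perp . j != 0 and |j| != |l|. If `cutoff` is given,
    additionally require |l-j| <= cutoff and |l+j| <= cutoff."""
    new = set(current)
    for l in current:
        for j in partners:
            cross = -l[1] * j[0] + l[0] * j[1]  # l^perp . j
            if cross == 0:
                continue
            if l[0] ** 2 + l[1] ** 2 == j[0] ** 2 + j[1] ** 2:
                continue
            plus = (l[0] + j[0], l[1] + j[1])
            minus = (l[0] - j[0], l[1] - j[1])
            if cutoff is not None:
                c2 = cutoff ** 2
                if (plus[0] ** 2 + plus[1] ** 2 > c2
                        or minus[0] ** 2 + minus[1] ** 2 > c2):
                    continue
            for k in (plus, minus, (-minus[0], -minus[1])):
                if in_upper_half(k):
                    new.add(k)
    return new


def z_sets(z0, n_steps):
    """Return [Z_0, Z_1, ..., Z_{n_steps}], with j always drawn from Z_0."""
    sets = [set(z0)]
    for _ in range(n_steps):
        sets.append(cascade_step(sets[-1], sets[0]))
    return sets


def k_infinity(k0, cutoff):
    """Iterate K_n^N (j and l both from K_{n-1}^N) until it stops growing."""
    current = set(k0)
    while True:
        nxt = cascade_step(current, current, cutoff)
        if nxt == current:
            return current
        current = nxt
```

For example, `z_sets({(0, 1), (1, 1)}, 1)[1]` is the set `{(0, 1), (1, 1), (1, 2), (-1, 0), (1, 0)}`: the single admissible pair $\ell = (0,1)$, $j = (1,1)$ contributes $\ell + j$, $\ell - j$ and $j - \ell$.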

Theorem 9 below will state its assumptions in terms of $\mathcal{K}^N_\infty$ whereas Theorem
10 will use $Z_\infty$. It is likely that for given $\mathcal{K}_{\cos}$ and $\mathcal{K}_{\sin}$, $Z_\infty = \cup_N \mathcal{K}^N_\infty$ (one
direction is clear); however, a proof is not immediately obvious. Furthermore, a sketch
of Theorem 10, under the same assumptions as Theorem 9, is given in [MP03].
Hence, we do not think there is any significant difference between the two sets.

The first result we state gives exponential mixing for the order $N$ Galerkin
approximation of (4) with forcing of the form (38) provided an algebraic condition
on the wave numbers forced, given in terms of $\mathcal{K}^N_\infty$, is satisfied. By the Galerkin
approximation of order $N$, we mean the finite system of coupled ODEs obtained by
setting to zero, for all time, any Fourier mode with $|k| > N$. This approximation
returns us to the setting of standard hypoelliptic SDEs in $\mathbb{R}^d$. Using a weak version of
Hörmander's sum of squares theorem (cf. [KS84, Nor86, Bel95]), it was shown that
the diffusion has a smooth $C^\infty$ density. Then, using some standard Markov chain
theory for a Harris chain with a Foster-Lyapunov function, one obtains exponential
mixing.

Theorem 9. [EM01] Consider the order $N$ Galerkin approximation of the vorticity
equation (4). Assume that $\mathcal{K}^N_\infty = \{k \in \mathbb{Z}^2_+ : |k| \le N\}$. Denoting the solution by $\omega^N$,
one has the following mixing result.
If $\omega^N_0$ and $\tilde\omega^N_0$ are two initial conditions then for any $p > 1$ there exist positive
constants $B = B(p)$ and $\gamma = \gamma(p)$ so that

\[
\|P_t(\omega^N_0, \cdot\,) - P_t(\tilde\omega^N_0, \cdot\,)\|_{TV} \le \|P_t(\omega^N_0, \cdot\,) - P_t(\tilde\omega^N_0, \cdot\,)\|_{V(p)}
\le B\big[1 + |\omega^N_0|^{2p} + |\tilde\omega^N_0|^{2p}\big] e^{-\gamma t}.
\]

Here $\|\cdot\|_{TV}$ is the total variation norm on signed measures and $\|\cdot\|_{V(p)}$ the
weighted variational norm defined by

\[
\|P_t(\omega^N_0, \cdot\,) - P_t(\tilde\omega^N_0, \cdot\,)\|_{V(p)} = \sup_{\phi \in V_p} \big| E\phi(\omega^N(t)) - E\phi(\tilde\omega^N(t)) \big|
\]

with $V_p = \{\phi \text{ measurable with } |\phi(x)| < 1 + |x|^{2p}\}$. Taking $\tilde\omega^N_0$ distributed as the
invariant measure, one obtains exponential convergence to the invariant measure
and uniqueness of the invariant measure.

To make this theorem interesting, we need some examples of conditions on $\mathcal{K}_{\cos}$
and $\mathcal{K}_{\sin}$ so that it applies. The following Lemma gives simple conditions under
which the previous and next theorems hold.

Lemma 17.1. [EM01, MP03]

• If $\{(0,1), (1,1)\}$ or $\{(1,0), (1,1)\} \subset \mathcal{K}_{\cos} \cap \mathcal{K}_{\sin}$ then $Z_\infty = \mathbb{Z}^2_+$ and
$\mathcal{K}^N_\infty = \{k \in \mathbb{Z}^2_+ : |k| \le N\}$ for any $N$.

• Let $M, K \in \mathbb{N}$ with $M, K > 2$ and $|M - K| > 2$. Then if

$\{(M + 1, 0), (M, 0), (0, K + 1), (0, K)\} \subset Z_0$
then $Z_\infty = \mathbb{Z}^2_+$. If in addition $M, K < N - 1$ then $\mathcal{K}^N_\infty = \{k \in \mathbb{Z}^2_+ : |k| \le N\}$.

This gives only two examples of types of forcing which are sufficiently distributed
to ensure ergodicity. Many other choices are possible. The author thanks A.
Majda and P. Constantin for stimulating conversations which pushed him to verify
the second part of Lemma 17.1. It provides an example of forcing which allows
one to observe both the energy and enstrophy cascades which are present in two
dimensional fluid systems. Of course, the most interesting question would be to
make some qualitative statement connecting this cascade of probability with the
dynamics.

A theorem similar to Theorem 9, but for the three dimensional Galerkin
approximation, is proven in [Rom02]. There he proves even more; he shows that the system
is actually globally controllable. This very interesting fact hinges on the observation
that the nonlinearity is off-diagonal in Fourier space and, hence, the system
has the good properties of systems with odd-powered polynomial nonlinearities (see
[Jur97]).

Theorem 9 gives a strong indication that a similar theorem holds for the full PDE;
however, a proof currently eludes the community. The following theorem shows
that at least one of the needed ingredients persists for the full infinite dimensional
vorticity equation.
Defining



= Span ({ sin(A; ■x):keZ OQ U Z™ s } U { cos(A; • x) : k G 2 M U Z° in }) 

we have the following density result for the finite dimensional marginals of equation 
(4). 

Theorem 10. [MP03] For any $t > 0$ and any finite dimensional subspace $S$ of
$S_\infty$, the law of the orthogonal projection $\Pi_S \omega(t, \cdot\,)$ of $\omega(t, \cdot\,)$ onto $S$ is absolutely
continuous with respect to the Lebesgue measure on $S$.

This of course is not enough to prove ergodicity. It addresses only the first part 
of Assumption 2. 

18. Open Questions 

A number of open questions have been mentioned in the text. Here we collect 
them and add a few more. 

1. Extend the ergodic results to the case when not all of the determining modes are
forced. The results on the ergodicity of the Galerkin approximation suggest
strongly that the full PDE is ergodic under weaker assumptions than Theorem 3.
Theorem 9 gives an indication of what the proper assumptions should be. The
results on the existence of densities for the projection of transition densities
and the controllability of a finite number of variables give strong evidence
that nothing surprising happens in the full PDE.

2. Prove (or disprove) that even when the forcing has spatial Fourier modes which
decay super-exponentially, the solution still decays only exponentially in $|k|$.
Prove (or disprove) that this decay rate does not fluctuate with time in the
stationary state.

3. Related to the previous: "What is the natural topology of the transition
density of the Markov process defined by the SNS?"

4. Extend Theorem 3 to the full space. The case of bounded domains is the same
as the periodic case. However, the full space requires some additional ideas, if
not completely different ones.

5. Understand better the $\nu \to 0$ limit. A recent preprint [Kuk03] explores
this limit for one choice of forcing. However, the choice of scaling produces a
deterministic limit which is the less interesting case and does not correspond to
the traditional view of turbulence. In all cases, there remain many interesting
questions concerning the structure of the limiting solutions and the limit when
other types of forcing are used.

6. Make progress in the three dimensional problem. Unless a breakthrough is
made in the deterministic three dimensional problem, this would likely require
other methods. The methods used here proceed in a pathwise manner in
the high $k$ and, hence, can do no better than the deterministic theory. In
particular, the estimates used to get contraction of the high $k$ are similar to
those used to prove uniqueness of solutions. Recently Da Prato and Debussche
have shown that by a selection principle one can build a stochastic process
associated to the 3D problem and that this process under certain conditions
has a unique invariant measure. Unfortunately the conditions on the forcing
require it to have algebraic decay in $|k|$.
A. Comparison of Measures on Path Space

Suppose that we have stochastic processes $X^{(i)}(t)$, $i = 1, 2$, on the path space
$C([0,T],X)$ where $X$ is some Hilbert space and $T \in (0, \infty]$. Furthermore, assume
that $X^{(i)}$ satisfies the equation

\[
dX^{(i)}(t) = f_i(t, X^{(i)}_{[0,t]})\,dt + g\,dW(t), \quad t \in [0,T], \qquad X^{(i)}(0) = x_0. \tag{39}
\]

Here, for fixed $t$ the functions $f_1$ and $f_2$ map the space $C_{[0,t]} = C([0,t], X)$ to $X$. By
$X_{[0,t]}$ we mean the segment of the trajectory on $[0,t]$. $W(t)$ is a cylindrical Brownian
motion over a Hilbert space $Y$ and $g$ is an invertible Hilbert-Schmidt operator from
$Y \to X$. For any $B \subset C_{[0,T]}$, define measures $P^{(i)}_{[0,T]}(\,\cdot\,; B)$ on the path space as:

\[
P^{(i)}_{[0,T]}(A; B) = P\{X^{(i)}_{[0,T]} \in A \cap B\}, \quad \text{for } A \subset C_{[0,T]}.
\]

Define also $D(t, \cdot\,) = f_1(t, \cdot\,) - f_2(t, \cdot\,)$.

In this setting, we have the following result which is a variation on Lemma B.1
from [Mat02c] and follows quickly from Girsanov's Theorem. Similar versions of
this lemma can be found in [MS03] and [BM03].

Lemma A.1. Assume there exists a constant $D_* \in (0, \infty)$ such that

\[
\exp\Big\{ \frac{1}{2} \int_0^T \big|g^{-1} D(t, X^{(i)}_{[0,t]})\big|_Y^2\, 1_{B(t)}(X^{(i)}_{[0,t]})\,dt \Big\} < D_* \tag{40}
\]

almost surely for $i = 1, 2$. Then the measures $P^{(1)}_{[0,T]}(\,\cdot\,; B)$ and $P^{(2)}_{[0,T]}(\,\cdot\,; B)$ are
equivalent. In addition for any $p > 0$

\[
E\Big[ \Big( \frac{dP^{(1)}_{[0,T]}(\,\cdot\,; B)}{dP^{(2)}_{[0,T]}(\,\cdot\,; B)} \Big)^{p} \Big] \le D_*^{\,p^2 - p}.
\]

And lastly

\[
\big\| P^{(1)}_{[0,T]}(\,\cdot\,; B) - P^{(2)}_{[0,T]}(\,\cdot\,; B) \big\|_{TV} \le (D_*^2 - 1)^{1/2}.
\]



Proof: Define the auxiliary SDEs

\[
dY^{(i)}(t) = f_i(t, Y^{(i)}_{[0,t]})\, 1_{B(t)}(Y^{(i)}_{[0,t]})\,dt + g\,dW(t), \qquad Y^{(i)}(0) = x_0,
\]

where $B(t) = \{x \in C_{[0,t]} : \exists\, \bar x \in B \text{ such that } x(s) = \bar x(s) \text{ for } s \in [0,t]\}$. Solutions
$Y^{(i)}(t)$ to these equations can be constructed as

\[
Y^{(i)}(t) = X^{(i)}(t)\, 1_{\{t \le \tau\}} + \big[gW(t) - gW(\tau) + X^{(i)}(\tau)\big] 1_{\{t > \tau\}}.
\]

Here $\tau = \inf\{s > 0 : X^{(i)}_{[0,s]} \notin B(s)\}$.

Denote $D_B(t,x) = [f_1(t,x) - f_2(t,x)]\, 1_{B(t)}(x)$. The assumption on $D$ in (40) and
the definition of $B(t)$ imply that

\[
\exp\Big\{ \frac{1}{2} \int_0^T \big|g^{-1} D_B(t, X_{[0,t]})\big|_Y^2\,dt \Big\} < D_* \quad \text{a.s.}
\]

under both measures $P^{(i)}_{Y,[0,T]}$, defining solutions to the auxiliary equations with $i = 1$ and
$i = 2$. Hence, Novikov's condition is satisfied for the difference of the drifts of
the auxiliary equations and Girsanov's theorem implies that $\frac{dP^{(1)}_{Y,[0,T]}}{dP^{(2)}_{Y,[0,T]}}(x) = \mathcal{E}(x)$
where the Radon-Nikodym derivative evaluated at a trajectory $x$ is defined by the
stochastic exponent:

\[
\mathcal{E}(x) = \exp\Big\{ \int_0^T \big\langle g^{-1} D_B(s, x_{[0,s]}), dW(s) \big\rangle_Y - \frac{1}{2} \int_0^T \big|g^{-1} D_B(s, x_{[0,s]})\big|_Y^2\,ds \Big\}.
\]

Note that the restrictions of the measures $P^{(i)}_{Y,[0,T]}$ to the set $B$ coincide with $P^{(i)}_{[0,T]}(\,\cdot\,; B)$.
This proves that $P^{(1)}_{[0,T]}(\,\cdot\,; B)$ is absolutely continuous with respect to $P^{(2)}_{[0,T]}(\,\cdot\,; B)$.
The reverse relation follows by symmetry and the proof of equivalence is complete.
To prove the second estimate, notice that

\[
\mathcal{E}^p = \exp\Big\{ p \int_0^T \big\langle g^{-1} D_B(s, x_{[0,s]}), dW(s) \big\rangle_Y - \frac{p}{2} \int_0^T \big|g^{-1} D_B(s, x_{[0,s]})\big|_Y^2\,ds \Big\}
= \mathcal{E}_p \exp\Big\{ \frac{p^2 - p}{2} \int_0^T \big|g^{-1} D_B(s, x_{[0,s]})\big|_Y^2\,ds \Big\} \le \mathcal{E}_p\, D_*^{\,p^2 - p}
\]

where $\mathcal{E}_p$ is the martingale defined by

\[
\mathcal{E}_p = \exp\Big\{ p \int_0^T \big\langle g^{-1} D_B(s, x_{[0,s]}), dW(s) \big\rangle_Y - \frac{p^2}{2} \int_0^T \big|g^{-1} D_B(s, x_{[0,s]})\big|_Y^2\,ds \Big\}.
\]

Hence, $E\mathcal{E}_p = 1$ and in light of the estimate on $\mathcal{E}^p$, the proof is complete. To see
the last estimate, use the Cauchy-Schwarz inequality to obtain the first inequality.
Then expand the square and use the fact that the Radon-Nikodym derivative is a
martingale with expectation one to obtain the bound $\big(E\big(\tfrac{dP^{(1)}}{dP^{(2)}}\big)^2 - 1\big)^{1/2}$. Applying the
previous estimate to the square gives the result. □
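The last step of the proof can be written out as a short worked computation. This is a sketch under simplifying assumptions: both measures are taken to be probability measures, $S$ is shorthand (introduced here) for the Radon-Nikodym derivative $dP^{(1)}/dP^{(2)}$, $E^{(2)}$ denotes expectation under $P^{(2)}$, the total variation norm carries the factor $\frac{1}{2}$ used in Appendix B, and the moment bound is taken in the form $E^{(2)} S^p \le D_*^{p^2-p}$ suggested by the proof.

```latex
\[
\|P^{(1)} - P^{(2)}\|_{TV}
  = \tfrac{1}{2} E^{(2)} |S - 1|
  \le \tfrac{1}{2} \big( E^{(2)} (S - 1)^2 \big)^{1/2}
  = \tfrac{1}{2} \big( E^{(2)} S^2 - 2 E^{(2)} S + 1 \big)^{1/2}
  = \tfrac{1}{2} \big( E^{(2)} S^2 - 1 \big)^{1/2}
  \le \tfrac{1}{2} \big( D_*^{2} - 1 \big)^{1/2}.
\]
```

The first inequality is Cauchy-Schwarz, the last equality uses $E^{(2)} S = 1$, and the final inequality is the moment bound with $p = 2$, for which $p^2 - p = 2$.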



B. Coupling Estimates 

For any two probability measures $\mu_1$ and $\mu_2$ on a space $X$, we can always write
them relative to a common measure $\nu$ so that $d\mu_i = \psi_i\,d\nu$. Then we define the
measures $(\mu_1 \wedge \mu_2)(\,\cdot\,)$, $(\mu_1 - \mu_2)^+(\,\cdot\,)$, and $(\mu_2 - \mu_1)^+(\,\cdot\,)$ respectively by the
densities $(\psi_1 \wedge \psi_2)\,d\nu$, $(\psi_1 - \psi_2)^+\,d\nu$, $(\psi_2 - \psi_1)^+\,d\nu$ where $a \wedge b = \min(a, b)$ and $(a)^+$
is $a$ if $a$ is positive and zero otherwise. Notice that $\mu_1 = (\mu_1 \wedge \mu_2) + (\mu_1 - \mu_2)^+$.
Also observe that if $\|\mu_1 - \mu_2\|_{TV} = \frac{1}{2}\int |\psi_1 - \psi_2|\,d\nu$ is the total variation norm then
$\|\mu_1 - \mu_2\|_{TV} = 1 - (\mu_1 \wedge \mu_2)(X) = (\mu_1 - \mu_2)^+(X) = (\mu_2 - \mu_1)^+(X)$. The proof of
the following lemma can be found in the appendix of [Mat02c].
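The identities in the preceding paragraph can be checked by hand on a finite state space. The following is a small illustrative sketch in which measures are dictionaries from points to masses, counting measure plays the role of the common reference measure $\nu$, and exact rational arithmetic avoids rounding. The helper names `overlap_decomposition` and `total_variation` and the example measures are hypothetical.

```python
from fractions import Fraction


def overlap_decomposition(mu1, mu2):
    """Return (mu1 ^ mu2, (mu1 - mu2)^+, (mu2 - mu1)^+) for two measures
    given as dicts point -> mass (counting measure is the reference)."""
    points = set(mu1) | set(mu2)
    zero = Fraction(0)
    common = {x: min(mu1.get(x, zero), mu2.get(x, zero)) for x in points}
    plus12 = {x: max(mu1.get(x, zero) - mu2.get(x, zero), zero) for x in points}
    plus21 = {x: max(mu2.get(x, zero) - mu1.get(x, zero), zero) for x in points}
    return common, plus12, plus21


def total_variation(mu1, mu2):
    """(1/2) * sum |psi_1 - psi_2|, the normalisation used in the text."""
    points = set(mu1) | set(mu2)
    zero = Fraction(0)
    return sum(abs(mu1.get(x, zero) - mu2.get(x, zero)) for x in points) / 2


mu1 = {"a": Fraction(1, 2), "b": Fraction(1, 2)}
mu2 = {"b": Fraction(1, 4), "c": Fraction(3, 4)}
common, plus12, plus21 = overlap_decomposition(mu1, mu2)
tv = total_variation(mu1, mu2)
```

Here the common part has total mass $1/4$ (all of it at the point `"b"`), and `tv` equals `Fraction(3, 4)`, which agrees with $1 - (\mu_1 \wedge \mu_2)(X)$ and with the total masses of both positive parts.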

Lemma B.1. Let $\mu_1$ and $\mu_2$ be two measures on a space $X$ with $\mu_i(X) \le 1$. Assume
that $\mu_1$ is equivalent to $\mu_2$ and that there exists a constant $C > 0$ and $p > 1$ so that



then 




Notice that this lower bound is strictly positive if $\mu_1(X) > 0$ (or equivalently $\mu_2(X) > 0$).
Errors/Typos corrected since original version:

12/03 Fix misplaced "a" in Assumption 7 on p. 28. Fix missing power of 2 in definitions of £ a and £\ on p. 2 and p. 20 
respectively. Correct omitted restriction to T>(G) on p. 9 and p. 24 and associated rewording of Assumption 4 on p. 10. 
Clarify assumptions on <P on p. 10. Replace cosmetics with cosmetic on p. 33. 

2/04 Fix direction of inequality in Lemma 15.3. 